Learning Dynamic Rope Manipulation Using Task-Level Iterative Learning Control

Chris Atkeson; Krishna Suresh

arxiv: 2602.21302 · v2 · pith:XTYUDGOZnew · submitted 2026-02-24 · 💻 cs.RO

Learning Dynamic Rope Manipulation Using Task-Level Iterative Learning Control

Krishna Suresh , Chris Atkeson This is my paper

Pith reviewed 2026-05-15 19:41 UTC · model grok-4.3

classification 💻 cs.RO

keywords iterative learning controlrope manipulationdynamic manipulationflying knotquadratic programmingtask-space controlhardware learningdeformable objects

0 comments

The pith

Task-level iterative learning control lets robots master dynamic flying knots from one human demonstration and a simplified rope model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a method that learns to perform the flying knot task by updating robot actions at each trial through quadratic-program inversion of a combined robot-and-rope model. The approach starts from a single human demonstration and then refines the motion directly on the physical robot, without needing large datasets or extensive simulation. Across seven rope types that differ in thickness, density, and material, the controller reaches 100 percent success within ten trials and can switch between most ropes in two to five additional trials. A sympathetic reader would care because the method shows that modest model-based updates can replace the usual heavy reliance on simulation or many demonstrations when handling deformable objects under dynamic conditions.

Core claim

The method inverts a simplified model of the robot and rope at each iteration by solving a quadratic program that maps observed task-space errors into corrective action updates; starting from one human demonstration, this produces a controller that reliably executes the non-planar flying-knot motion on hardware.

What carries the argument

Task-level iterative learning control that solves a quadratic program to propagate task-space errors into action updates using a simplified rope model.

If this is right

The same inversion step yields 100 percent success on every tested rope within ten hardware trials.
Most rope-to-rope transfers succeed after only two to five additional trials.
No large demonstration sets or massive simulation are required once the initial demonstration is given.
The approach works for non-planar dynamic manipulation without explicit planning of contact sequences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same error-propagation step could be applied to other under-actuated deformable tasks such as cloth folding or cable routing.
Because updates occur directly on hardware, the method may remain effective when simulation parameters drift from real-world conditions.
If the quadratic program can be solved at higher frequency, the same framework might support online adaptation during a single continuous motion.

Load-bearing premise

The simplified rope model is accurate enough that the quadratic-program inversion produces action updates that actually reduce the observed task errors.

What would settle it

Running the learning procedure on a new rope and finding that success rate stays below 100 percent after ten trials, or that the quadratic-program updates produce no consistent improvement in knot completion.

Figures

Figures reproduced from arXiv: 2602.21302 by Chris Atkeson, Krishna Suresh.

**Figure 2.** Figure 2: Critical point for the flying knot across 4 demonstration variations. The shape of the rope at collision is used for the learning objective. a distant object, such as a cleat or tree branch by manipulating one end. The flying knot task was chosen because it is impressive but achievable by humans and fast robots. There are several types of flying knots. We focus on the Overhand Knot, in which one hand manip… view at source ↗

**Figure 3.** Figure 3: Task-Level Iterative Learning Control System: A demonstration is converted to an initial command. The command trajectory u(t) is executed on the real system, and the resulting trajectory x(t) is measured. The task error ˜x(t) at the critical point is mapped through the inverse model M−1 to command trajectory corrections ∆u(t), which are applied to the current feedforward command, closing the learning loop.… view at source ↗

**Figure 4.** Figure 4: Critical point vs equal-weighted objective learned commands. Each row is a real trial after 8 iterations with the corresponding objective. The green rope is the measured state from the real trial, and the red rope is the goal rope from the demonstration. The rope state at 0.46s is the critical point. The critical point objective trial results in a successful flying knot, while the equal-weighted objective … view at source ↗

**Figure 5.** Figure 5: Left: Kinematic robot model at various stages of a command. The opaque robot is at the point of contact. Right: Graphical representation of the point mass rope model. The robot’s fingertip trajectory kinematically drives the first red dot. The remaining links are bound by distance constraints, and each joint has stiffness and damping. Together, the robot and rope models define our system dynamics model. Al… view at source ↗

**Figure 7.** Figure 7: Initial command fingertip trajectory: positions of the demonstration fingertip trajectory are dashed, and the robot commanded initial attempt to track the demonstration are solid. The robot fails to track the demonstration motion due to kinematic and dynamic constraints. attempting to follow the demonstration trajectory. However, for certain ropes, such as 4 and 7, following the demonstration nearly achiev… view at source ↗

**Figure 8.** Figure 8: Left: Critical point objective for 7 rope types over 10 iterations (rope 3 and 7 end early due to marker tracking failures). Squares represent the first successful flying knot, and every large solid dot represents a subsequent successful knot. Right: Real successful flying knot execution on 7 rope types overlayed. Red highlighted frame is the critical point [PITH_FULL_IMAGE:figures/full_fig_p007_8.png] view at source ↗

**Figure 9.** Figure 9: Rope configuration at the critical point over learning iterations: Real rope (green) and goal rope(red) states for 5 iterations of Task-Level ILC where trial 5 resulted in a flying knot. E. Learning Across Rope Types We evaluate the learning of the flying knot across 7 different ropes (as shown in [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 11.** Figure 11: Number of trials to transfer a successful command from rope A to [PITH_FULL_IMAGE:figures/full_fig_p008_11.png] view at source ↗

read the original abstract

We introduce a Task-Level Iterative Learning Control method for dynamic manipulation of ropes. We demonstrate this method on a non-planar rope manipulation task called the flying knot. Using a single human demonstration and a simplified rope model, the method learns directly on hardware without reliance on large amounts of demonstration data or massive amounts of simulation. At each iteration, the algorithm inverts a model of the robot and rope by solving a quadratic program to propagate task-space errors into action updates. We evaluate performance across 7 different kinds of ropes, including chain, latex surgical tubing, and braided and twisted ropes, ranging in thicknesses of 7--25\,mm and densities of 0.013--0.5\,kg/m. Learning achieves a 100\% success rate within 10 trials on all ropes. Furthermore, the method can successfully transfer between most rope types in 2--5 trials. https://flying-knots.github.io

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Task-level ILC with QP inversion of a simple rope model lets them learn flying knots from one demo on hardware and transfer across rope types in a few trials.

read the letter

The core result is that a task-level iterative learning control scheme, using quadratic program inversion on a simplified rope model, can take one human demonstration and produce reliable flying knot performance on real hardware. It reaches 100% success within 10 trials for seven ropes that differ in material, thickness, and density, and it transfers between most of them in 2-5 additional trials. That is the practical advance worth noting: direct hardware learning without large demonstration sets or heavy simulation.

Referee Report

2 major / 1 minor

Summary. The paper introduces Task-Level Iterative Learning Control (TILC) for dynamic rope manipulation on the non-planar flying knot task. Using one human demonstration and a single simplified rope model, the approach learns directly on hardware by solving a quadratic program (QP) at each iteration to invert the combined robot-rope model and map task-space errors into action updates. It reports 100% success within 10 trials across seven ropes (chain, latex, braided, twisted; 7-25 mm, 0.013-0.5 kg/m) and successful transfer between most rope types in 2-5 trials, without large demonstration sets or simulation.

Significance. If the central result holds, the work demonstrates a practical, data-efficient route to hardware-only learning for underactuated dynamic manipulation of deformable objects. The single-model, single-demo, QP-inversion design avoids the usual requirements for massive simulation or many demonstrations and shows rapid cross-rope transfer; these are concrete strengths that would be of interest to the robotics community working on deformable-object control.

major comments (2)

[Abstract] Abstract: the central claim that the simplified rope model enables reliable QP inversion for the flying knot rests on the unverified assumption that the model captures the inertial, contact, and timing dynamics sufficiently for non-planar swings. No quantitative validation (e.g., comparison of predicted vs. observed rope trajectories or residual error after QP updates) is provided to show that the model-derived corrections are the primary driver of the reported 100% success rather than hardware trial-and-error.
[Abstract] The evaluation reports 100% success within 10 trials and 2-5 trial transfer but supplies no per-rope trial statistics, failure-mode breakdown, or task-error metrics (e.g., knot completion time, position error at contact). Without these, it is impossible to assess whether the QP updates are effective or whether the result is driven by the simplified model.

minor comments (1)

[Abstract] The abstract states the method works across seven ropes but does not specify the exact model equations or the QP formulation (objective, constraints, decision variables). Adding these would allow readers to evaluate the inversion step directly.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the significance of our work and for the constructive feedback. We address each of the major comments below.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that the simplified rope model enables reliable QP inversion for the flying knot rests on the unverified assumption that the model captures the inertial, contact, and timing dynamics sufficiently for non-planar swings. No quantitative validation (e.g., comparison of predicted vs. observed rope trajectories or residual error after QP updates) is provided to show that the model-derived corrections are the primary driver of the reported 100% success rather than hardware trial-and-error.

Authors: We acknowledge that providing quantitative validation of the rope model's accuracy would strengthen the manuscript. In the revised version, we have added a new subsection in the experiments that compares the rope trajectories predicted by the simplified model against the observed trajectories from the robot's motion capture system for representative trials. We also report the residual errors in the QP solutions across iterations to demonstrate that the model-based updates are indeed driving the convergence to successful knotting, rather than pure trial-and-error. revision: yes
Referee: [Abstract] The evaluation reports 100% success within 10 trials and 2-5 trial transfer but supplies no per-rope trial statistics, failure-mode breakdown, or task-error metrics (e.g., knot completion time, position error at contact). Without these, it is impossible to assess whether the QP updates are effective or whether the result is driven by the simplified model.

Authors: We agree that additional details on the per-rope performance would improve the evaluation. We have revised the results section to include a table summarizing the number of trials required for success for each of the seven ropes, along with average knot completion times and position errors at the point of contact. Failure modes were primarily early-trial timing mismatches, which were corrected by the ILC updates. This data supports that the QP-based corrections are effective across rope types. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in the ILC derivation

full rationale

The paper's core derivation uses a fixed simplified rope model inverted via quadratic programming to map observed task-space errors (from hardware trials) into action updates. This inversion is a standard model-based correction step and does not reduce to the target success rate or any fitted parameter by construction. Success rates (100% within 10 trials, transfer in 2-5 trials) are reported as empirical outcomes on real hardware across 7 ropes, not as mathematical identities derived from the inputs. No self-citation chains, ansatz smuggling, or renaming of known results appear as load-bearing elements in the provided abstract and method description. The approach remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that a simplified rope model enables effective QP inversion for error correction; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption Simplified rope model suffices for QP-based error propagation in dynamic manipulation
Invoked to justify direct hardware learning from task-space errors without full dynamics.

pith-pipeline@v0.9.0 · 5451 in / 1183 out tokens · 21011 ms · 2026-05-15T19:41:16.455300+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

At each iteration, the algorithm inverts a model of the robot and rope by solving a quadratic program to propagate task-space errors into action updates.
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We model the rope as a 3D serial chain of point masses m connected by fixed-distance constraints of length l. Each joint in the rope has a bending stiffness k and a damping coefficient b.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

[1]

Using inaccurate models in reinforcement learning

Pieter Abbeel, Morgan Quigley, and Andrew Y Ng. Using inaccurate models in reinforcement learning. In Proceedings of the 23rd International Conference on Machine Learning, pages 1–8, 2006

work page 2006
[2]

Potential ﬁeld methods and their inherent limitations for mobile robot navigation,

E.W. Aboaf, C.G. Atkeson, and D.J. Reinkensmeyer. Task-level robot learning. InProceedings. 1988 IEEE International Conference on Robotics and Automation, pages 1309–1310 vol.2, 1988. doi: 10.1109/ROBOT. 1988.12245

work page doi:10.1109/robot 1988
[3]

A regret minimization approach to iterative learning control

Naman Agarwal, Elad Hazan, Anirudha Majumdar, and Karan Singh. A regret minimization approach to iterative learning control. InProceedings of the 38th Interna- tional Conference on Machine Learning, pages 100–

work page
[4]

URL https://proceedings.mlr.press/v139/ agarwal21b.html

PMLR. URL https://proceedings.mlr.press/v139/ agarwal21b.html

work page
[5]

An, Christopher G

Chae H. An, Christopher G. Atkeson, and John Holler- bach.Model-Based Control of a Robot Manipulator. Artificial Intelligence Series. MIT Press. ISBN 978-0- 262-51157-5

work page
[6]

Bettering operation of robots by learning

Suguru Arimoto, Sadao Kawamura, and Fumio Miyazaki. Bettering operation of robots by learning. Journal of Robotic Systems, 1(2):123–140, 1984

work page 1984
[7]

Atkeson and J

C. Atkeson and J. McIntyre. Robot trajectory learning through practice. In1986 IEEE International Con- ference on Robotics and Automation Proceedings, vol- ume 3, pages 1737–1742. URL https://ieeexplore.ieee. org/document/1087423

work page arXiv
[8]

Bristow, M

D.A. Bristow, M. Tharayil, and A.G. Alleyne. A survey of iterative learning control.IEEE Control Systems Magazine, 26(3):96–114, 2006. doi: 10.1109/MCS.2006. 1636313

work page doi:10.1109/mcs.2006 2006
[9]

Linear-time vari- ational integrators in maximal coordinates

Jan Br ¨udigam and Zachary Manchester. Linear-time vari- ational integrators in maximal coordinates. In Steven M. LaValle, Ming Lin, Timo Ojala, Dylan Shell, and Jingjin Yu, editors,Algorithmic Foundations of Robotics XIV, volume 17, pages 194–209. Springer International Pub- lishing. doi: 10.1007/978-3-030-66723-8 12. URL http: //link.springer.com/10.100...

work page doi:10.1007/978-3-030-66723-8
[10]

Efficiently learning single-arm fling motions to smooth garments

Lawrence Yunliang Chen, Huang Huang, Ellen Novoseller, Daniel Seita, Jeffrey Ichnowski, Michael Laskey, Richard Cheng, Thomas Kollar, and Ken Goldberg. Efficiently learning single-arm fling motions to smooth garments. URL http://arxiv.org/abs/2206.08921

work page arXiv
[11]

Yiyang Chen, Bing Chu, and Christopher T. Freeman. Point-to-point iterative learning control with optimal tracking time allocation.IEEE Transactions on Con- trol Systems Technology, 26(5):1685–1698, 2018. doi: 10.1109/TCST.2017.2735358

work page doi:10.1109/tcst.2017.2735358 2018
[12]

Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects

Cheng Chi, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, and Shuran Song. Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects. URL http://arxiv.org/abs/2203.00663

work page arXiv
[13]

Adaptive terminal ilc for iteration- varying target points.Asian Journal of Control, 17(3): 952–962, 2015

Ronghu Chi, Danwei Wang, Frank L Lewis, Zhongsheng Hou, and Shangtai Jin. Adaptive terminal ilc for iteration- varying target points.Asian Journal of Control, 17(3): 952–962, 2015

work page 2015
[14]

Gill, Walter Murray, and Michael A

Philip E. Gill, Walter Murray, and Michael A. Saunders. Snopt: An sqp algorithm for large-scale constrained optimization.SIAM Journal on Optimization, 12(4):979– 1006, 2002. doi: 10.1137/S1052623499350013. URL https://doi.org/10.1137/S1052623499350013

work page doi:10.1137/s1052623499350013 2002
[15]

Goulart and Yuwen Chen

Paul J. Goulart and Yuwen Chen. Clarabel: An interior- point solver for conic programs with quadratic objectives,

work page
[16]

URL https://arxiv.org/abs/2405.12762

work page arXiv
[17]

Zico Kolter, and Zachary Manchester

Swaminathan Gurumurthy, J. Zico Kolter, and Zachary Manchester. Deep off-policy iterative learning control. In Proceedings of The 5th Annual Learning for Dynamics and Control Conference, pages 639–652. PMLR. URL https://proceedings.mlr.press/v211/gurumurthy23b.html

work page
[18]

Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfold- ing, 2021

Huy Ha and Shuran Song. Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfold- ing, 2021. URL https://arxiv.org/abs/2105.03655

work page arXiv 2021
[19]

Dynamic manipulation of deformable objects using imitation learning with adaptation to hard- ware constraints

Eric Hannus, Tran Nguyen Le, David Blanco-Mulero, and Ville Kyrki. Dynamic manipulation of deformable objects using imitation learning with adaptation to hard- ware constraints. In2024 IEEE/RSJ International Con- ference on Intelligent Robots and Systems (IROS), pages 12655–12662. IEEE, 2024

work page 2024
[20]

RaC: Robot learning for long-horizon tasks by scaling recovery and correction

Zheyuan Hu, Robyn Wu, Naveen Enock, Jasmine Li, Riya Kadakia, Zackory Erickson, and Aviral Kumar. RaC: Robot learning for long-horizon tasks by scaling recovery and correction. URL http://arxiv.org/abs/2509. 07953

work page
[21]

Self- supervised cloth reconstruction via action-conditioned cloth tracking

Zixuan Huang, Xingyu Lin, and David Held. Self- supervised cloth reconstruction via action-conditioned cloth tracking. InIEEE International Conference on Robotics and Automation (ICRA), 2023, 2023

work page 2023
[22]

Tae-Yong Kuc, Kwanghee Nam, and J.S. Lee. An iterative learning control of robot manipulators.IEEE Transactions on Robotics and Automation, 7(6):835–842,

work page
[23]

doi: 10.1109/70.105392

work page doi:10.1109/70.105392
[24]

Varadarajan, A

Vincent Lim, Huang Huang, Lawrence Yunliang Chen, Jonathan Wang, Jeffrey Ichnowski, Daniel Seita, Michael Laskey, and Ken Goldberg. Real2sim2real: Self- supervised learning of physical single-step dynamic ac- tions for planar robot casting. In2022 International Conference on Robotics and Automation (ICRA), pages 8282–8289. doi: 10.1109/ICRA46639.2022.9811...

work page doi:10.1109/icra46639.2022.9811651 2022
[25]

Event-triggered nonlinear iterative learning control.IEEE Transactions on Neural Networks and Learning Systems, 32(11):5118–5128, 2020

Na Lin, Ronghu Chi, Biao Huang, and Zhongsheng Hou. Event-triggered nonlinear iterative learning control.IEEE Transactions on Neural Networks and Learning Systems, 32(11):5118–5128, 2020

work page 2020
[26]

Iterative learning control: A survey and new results.Journal of Robotic Systems, 9(5):563–594, 1992

Kevin L Moore, Mohammed Dahleh, and SP Bhat- tacharyya. Iterative learning control: A survey and new results.Journal of Robotic Systems, 9(5):563–594, 1992

work page 1992
[27]

Nah, Aleksei Krotov, Marta Russo, Dagmar Sternad, and Neville Hogan

Moses C. Nah, Aleksei Krotov, Marta Russo, Dagmar Sternad, and Neville Hogan. Dynamic primitives facili- tate manipulating a whip. In2020 8th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), pages 685–691, Nov 2020. doi: 10.1109/BioRob49111.2020.9224399

work page doi:10.1109/biorob49111.2020.9224399 2020
[28]

Com- bining self-supervised learning and imitation for vision- based rope manipulation

Ashvin Nair, Dian Chen, Pulkit Agrawal, Phillip Isola, Pieter Abbeel, Jitendra Malik, and Sergey Levine. Com- bining self-supervised learning and imitation for vision- based rope manipulation. In2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2146–2153, 2017. doi: 10.1109/ICRA.2017.7989247

work page doi:10.1109/icra.2017.7989247 2017
[29]

Ratcliffe, Paul L

James D. Ratcliffe, Paul L. Lewin, Eric Rogers, Jari J. Hatonen, and David H. Owens. Norm-optimal iterative learning control applied to gantry robots for automation applications.IEEE Transactions on Robotics, 22(6): 1303–1307, 2006. doi: 10.1109/TRO.2006.882927

work page doi:10.1109/tro.2006.882927 2006
[30]

Sukhatme

Gautam Salhotra, I-Chun Arthur Liu, Marcus Dominguez-Kuhne, and Gaurav S. Sukhatme. Learning deformable object manipulation from expert demonstrations.IEEE Robotics and Automation Letters, 7(4):8775–8782, 2022. doi: 10.1109/LRA.2022.3187843

work page doi:10.1109/lra.2022.3187843 2022
[31]

Schoellig, Fabian L

Angela P. Schoellig, Fabian L. Mueller, and Raffaello D’Andrea. Optimization-based iterative learning for precise quadrocopter trajectory tracking. 33(1):103–127. doi: 10.1007/s10514-012-9283-2. URL https://doi.org/ 10.1007/s10514-012-9283-2

work page doi:10.1007/s10514-012-9283-2
[32]

Optimization- based iterative learning control for trajectory tracking

Angela Sch ¨ollig and Raffaello D’Andrea. Optimization- based iterative learning control for trajectory tracking. In 2009 European Control Conference (ECC), pages 1505– 1510, 2009. doi: 10.23919/ECC.2009.7074619

work page doi:10.23919/ecc.2009.7074619 2009
[33]

A framework for manipulating deformable linear ob- jects by coherent point drift.IEEE Robotics and Automation Letters, 3(4):3426–3433, 2018

Te Tang, Changhao Wang, and Masayoshi Tomizuka. A framework for manipulating deformable linear ob- jects by coherent point drift.IEEE Robotics and Automation Letters, 3(4):3426–3433, 2018. doi: 10. 1109/LRA.2018.2852770. URL https://ieeexplore.ieee. org/document/8403315

work page arXiv 2018
[34]

Drake: Model-based design and verification for robotics, 2019

Russ Tedrake and the Drake Development Team. Drake: Model-based design and verification for robotics, 2019. URL https://drake.mit.edu

work page 2019
[35]

van Oosten, O.H

C.L. van Oosten, O.H. Bosgra, and B.G. Dijkstra. Reduc- ing residual vibrations through iterative learning control, with application to a wafer stage. InProceedings of the 2004 American Control Conference, volume 6, pages 5150–5155 vol.6, 2004. doi: 10.23919/ACC.2004. 1384669

work page doi:10.23919/acc.2004 2004
[36]

An- drew Bagnell

Anirudh Vemula, Wen Sun, Maxim Likhachev, and J. An- drew Bagnell. On the effectiveness of iterative learning control. InProceedings of The 4th Annual Learning for Dynamics and Control Conference, pages 47–58. PMLR. URL https://proceedings.mlr.press/v168/vemula22a.html

work page
[37]

Self-supervised learning of dy- namic planar manipulation of free-end cables

Jonathan Wang, Huang Huang, Vincent Lim, Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Yunliang Chen, and Ken Goldberg. Self-supervised learning of dy- namic planar manipulation of free-end cables. URL http://arxiv.org/abs/2405.09581

work page arXiv
[38]

and Shah, Kunal and Miller, Lee E and Cotton, R

Junyi Wang and Xiaofeng Xiong. A learning-based control framework for human-like whip targeting. In 2024 10th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), pages 550–555. doi: 10.1109/BioRob60516.2024. 10719935. URL https://ieeexplore.ieee.org/document/ 10719935/. ISSN: 2155-1782

work page doi:10.1109/biorob60516.2024 2024
[39]

Multi-robot task planning under individual and collaborative temporal logic specifications

Xiaofeng Xiong, Moses C. Nah, Aleksei Krotov, and Dagmar Sternad. Online impedance adaptation fa- cilitates manipulating a whip. In2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 9297–9302. doi: 10.1109/ IROS51168.2021.9636663. URL https://ieeexplore.ieee. org/document/9636663

work page arXiv 2021
[40]

Motion planning for dynamic knotting of a flexible rope with a high-speed robot arm

Yuji Yamakawa, Akio Namiki, and Masatoshi Ishikawa. Motion planning for dynamic knotting of a flexible rope with a high-speed robot arm. In2010 IEEE/RSJ Inter- national Conference on Intelligent Robots and Systems, pages 49–54, 2010. doi: 10.1109/IROS.2010.5651168

work page doi:10.1109/iros.2010.5651168 2010
[41]

Motion planning for dynamic folding of a cloth with two high-speed robot hands and two high-speed sliders

Yuji Yamakawa, Akio Namiki, and Masatoshi Ishikawa. Motion planning for dynamic folding of a cloth with two high-speed robot hands and two high-speed sliders. In 2011 IEEE International Conference on Robotics and Automation, pages 5486–5491, 2011. doi: 10.1109/ ICRA.2011.5979606

work page arXiv 2011
[42]

Viser: Imperative, web-based 3d visualization in python.arXiv preprint arXiv:2507.22885, 2025

Brent Yi, Chung Min Kim, Justin Kerr, Gina Wu, Re- becca Feng, Anthony Zhang, Jonas Kulhanek, Hongsuk Choi, Yi Ma, Matthew Tancik, and Angjoo Kanazawa. Viser: Imperative, web-based 3d visualization in python. arXiv preprint arXiv:2507.22885, 2025

work page arXiv 2025
[43]

Autonomous robotic la- paroscopic surgery for intestinal anastomosis.Science Robotics, 7(62):eabj2908, 2022

Hang Yin, Anastasia Varava, and Danica Kragic. Mod- eling, learning, perception, and control methods for deformable object manipulation.Science Robotics, 6(54):eabd8803, 2021. doi: 10.1126/scirobotics. abd8803. URL https://www.science.org/doi/abs/10.1126/ scirobotics.abd8803

work page doi:10.1126/scirobotics 2021
[44]

Robots of the lost arc: Self-supervised learning to dynamically manipulate fixed-endpoint cables

Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Jonathan Wang, Huang Huang, and Ken Goldberg. Robots of the lost arc: Self-supervised learning to dynamically manipulate fixed-endpoint cables. URL http://arxiv.org/ abs/2011.04840. TABLE III INVERSEMODELPARAMETERS Parameter Value wcontrol 0.5 wcritical pos 25 wcritical vel 0.00375 wpc 100 wvc 0.1 wRc 5 wpf t ...

work page arXiv 2011
[45]

Inverse Model Parameters:We use the shorthand ∥a∥2 W :=a ⊤Wa. The critical-point objective weights the rope-marker position and velocity errors att c with a diagonal matrix Q:= diag(w critical posI3N , w critical velI3N), whereNis the number of rope markers (links). In general,w is a diagonal cost element for a cost matrix. The control-update regularizer ...

work page
[46]

Lete f t(t;u(t))denote the end-effector error vector defined in Section F

QP Hand Tracking Objective:Fort∈[t c, T], we encourage the robot to match the demonstrator’s follow- through motion by penalizing the end-effector tracking error. Lete f t(t;u(t))denote the end-effector error vector defined in Section F. This error depends nonlinearly on the command through the Bezier spline and forward kinematics. We linearize ef t about...

work page
[47]

Select a temporary range ˜t0 and ˜tf that excludes noisy data from the human picking up and placing the rope on the floor

work page
[48]

Search for maximum hand velocity in the range[ ˜t0, ˜tf] and label ast peak

work page
[49]

Step backwards in time fromt peak and update ˜t0 to the first point when the hand velocity is near-zero (3% of velocity att peak)

work page
[50]

Sett f =t c+35ms as a fixed follow through time length

work page
[51]

h(t)is then defined as 3D pose trajectory of the hand between t0 andt f .x demo(t)is the rope trajectory fromt 0 tot c

Integrate along the hand path motion between ˜t0 andt f , then sett 0 to the time when 5% of the total path length is traveled. h(t)is then defined as 3D pose trajectory of the hand between t0 andt f .x demo(t)is the rope trajectory fromt 0 tot c. Each demonstration type has a different execution speed and overall time length. TABLE V DEMONSTRATIONTRACKIN...

work page

[1] [1]

Using inaccurate models in reinforcement learning

Pieter Abbeel, Morgan Quigley, and Andrew Y Ng. Using inaccurate models in reinforcement learning. In Proceedings of the 23rd International Conference on Machine Learning, pages 1–8, 2006

work page 2006

[2] [2]

Potential ﬁeld methods and their inherent limitations for mobile robot navigation,

E.W. Aboaf, C.G. Atkeson, and D.J. Reinkensmeyer. Task-level robot learning. InProceedings. 1988 IEEE International Conference on Robotics and Automation, pages 1309–1310 vol.2, 1988. doi: 10.1109/ROBOT. 1988.12245

work page doi:10.1109/robot 1988

[3] [3]

A regret minimization approach to iterative learning control

Naman Agarwal, Elad Hazan, Anirudha Majumdar, and Karan Singh. A regret minimization approach to iterative learning control. InProceedings of the 38th Interna- tional Conference on Machine Learning, pages 100–

work page

[4] [4]

URL https://proceedings.mlr.press/v139/ agarwal21b.html

PMLR. URL https://proceedings.mlr.press/v139/ agarwal21b.html

work page

[5] [5]

An, Christopher G

Chae H. An, Christopher G. Atkeson, and John Holler- bach.Model-Based Control of a Robot Manipulator. Artificial Intelligence Series. MIT Press. ISBN 978-0- 262-51157-5

work page

[6] [6]

Bettering operation of robots by learning

Suguru Arimoto, Sadao Kawamura, and Fumio Miyazaki. Bettering operation of robots by learning. Journal of Robotic Systems, 1(2):123–140, 1984

work page 1984

[7] [7]

Atkeson and J

C. Atkeson and J. McIntyre. Robot trajectory learning through practice. In1986 IEEE International Con- ference on Robotics and Automation Proceedings, vol- ume 3, pages 1737–1742. URL https://ieeexplore.ieee. org/document/1087423

work page arXiv

[8] [8]

Bristow, M

D.A. Bristow, M. Tharayil, and A.G. Alleyne. A survey of iterative learning control.IEEE Control Systems Magazine, 26(3):96–114, 2006. doi: 10.1109/MCS.2006. 1636313

work page doi:10.1109/mcs.2006 2006

[9] [9]

Linear-time vari- ational integrators in maximal coordinates

Jan Br ¨udigam and Zachary Manchester. Linear-time vari- ational integrators in maximal coordinates. In Steven M. LaValle, Ming Lin, Timo Ojala, Dylan Shell, and Jingjin Yu, editors,Algorithmic Foundations of Robotics XIV, volume 17, pages 194–209. Springer International Pub- lishing. doi: 10.1007/978-3-030-66723-8 12. URL http: //link.springer.com/10.100...

work page doi:10.1007/978-3-030-66723-8

[10] [10]

Efficiently learning single-arm fling motions to smooth garments

Lawrence Yunliang Chen, Huang Huang, Ellen Novoseller, Daniel Seita, Jeffrey Ichnowski, Michael Laskey, Richard Cheng, Thomas Kollar, and Ken Goldberg. Efficiently learning single-arm fling motions to smooth garments. URL http://arxiv.org/abs/2206.08921

work page arXiv

[11] [11]

Yiyang Chen, Bing Chu, and Christopher T. Freeman. Point-to-point iterative learning control with optimal tracking time allocation.IEEE Transactions on Con- trol Systems Technology, 26(5):1685–1698, 2018. doi: 10.1109/TCST.2017.2735358

work page doi:10.1109/tcst.2017.2735358 2018

[12] [12]

Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects

Cheng Chi, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, and Shuran Song. Iterative residual policy: for goal-conditioned dynamic manipulation of deformable objects. URL http://arxiv.org/abs/2203.00663

work page arXiv

[13] [13]

Adaptive terminal ilc for iteration- varying target points.Asian Journal of Control, 17(3): 952–962, 2015

Ronghu Chi, Danwei Wang, Frank L Lewis, Zhongsheng Hou, and Shangtai Jin. Adaptive terminal ilc for iteration- varying target points.Asian Journal of Control, 17(3): 952–962, 2015

work page 2015

[14] [14]

Gill, Walter Murray, and Michael A

Philip E. Gill, Walter Murray, and Michael A. Saunders. Snopt: An sqp algorithm for large-scale constrained optimization.SIAM Journal on Optimization, 12(4):979– 1006, 2002. doi: 10.1137/S1052623499350013. URL https://doi.org/10.1137/S1052623499350013

work page doi:10.1137/s1052623499350013 2002

[15] [15]

Goulart and Yuwen Chen

Paul J. Goulart and Yuwen Chen. Clarabel: An interior- point solver for conic programs with quadratic objectives,

work page

[16] [16]

URL https://arxiv.org/abs/2405.12762

work page arXiv

[17] [17]

Zico Kolter, and Zachary Manchester

Swaminathan Gurumurthy, J. Zico Kolter, and Zachary Manchester. Deep off-policy iterative learning control. In Proceedings of The 5th Annual Learning for Dynamics and Control Conference, pages 639–652. PMLR. URL https://proceedings.mlr.press/v211/gurumurthy23b.html

work page

[18] [18]

Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfold- ing, 2021

Huy Ha and Shuran Song. Flingbot: The unreasonable effectiveness of dynamic manipulation for cloth unfold- ing, 2021. URL https://arxiv.org/abs/2105.03655

work page arXiv 2021

[19] [19]

Dynamic manipulation of deformable objects using imitation learning with adaptation to hard- ware constraints

Eric Hannus, Tran Nguyen Le, David Blanco-Mulero, and Ville Kyrki. Dynamic manipulation of deformable objects using imitation learning with adaptation to hard- ware constraints. In2024 IEEE/RSJ International Con- ference on Intelligent Robots and Systems (IROS), pages 12655–12662. IEEE, 2024

work page 2024

[20] [20]

RaC: Robot learning for long-horizon tasks by scaling recovery and correction

Zheyuan Hu, Robyn Wu, Naveen Enock, Jasmine Li, Riya Kadakia, Zackory Erickson, and Aviral Kumar. RaC: Robot learning for long-horizon tasks by scaling recovery and correction. URL http://arxiv.org/abs/2509. 07953

work page

[21] [21]

Self- supervised cloth reconstruction via action-conditioned cloth tracking

Zixuan Huang, Xingyu Lin, and David Held. Self- supervised cloth reconstruction via action-conditioned cloth tracking. InIEEE International Conference on Robotics and Automation (ICRA), 2023, 2023

work page 2023

[22] [22]

Tae-Yong Kuc, Kwanghee Nam, and J.S. Lee. An iterative learning control of robot manipulators.IEEE Transactions on Robotics and Automation, 7(6):835–842,

work page

[23] [23]

doi: 10.1109/70.105392

work page doi:10.1109/70.105392

[24] [24]

Varadarajan, A

Vincent Lim, Huang Huang, Lawrence Yunliang Chen, Jonathan Wang, Jeffrey Ichnowski, Daniel Seita, Michael Laskey, and Ken Goldberg. Real2sim2real: Self- supervised learning of physical single-step dynamic ac- tions for planar robot casting. In2022 International Conference on Robotics and Automation (ICRA), pages 8282–8289. doi: 10.1109/ICRA46639.2022.9811...

work page doi:10.1109/icra46639.2022.9811651 2022

[25] [25]

Event-triggered nonlinear iterative learning control.IEEE Transactions on Neural Networks and Learning Systems, 32(11):5118–5128, 2020

Na Lin, Ronghu Chi, Biao Huang, and Zhongsheng Hou. Event-triggered nonlinear iterative learning control.IEEE Transactions on Neural Networks and Learning Systems, 32(11):5118–5128, 2020

work page 2020

[26] [26]

Iterative learning control: A survey and new results.Journal of Robotic Systems, 9(5):563–594, 1992

Kevin L Moore, Mohammed Dahleh, and SP Bhat- tacharyya. Iterative learning control: A survey and new results.Journal of Robotic Systems, 9(5):563–594, 1992

work page 1992

[27] [27]

Nah, Aleksei Krotov, Marta Russo, Dagmar Sternad, and Neville Hogan

Moses C. Nah, Aleksei Krotov, Marta Russo, Dagmar Sternad, and Neville Hogan. Dynamic primitives facili- tate manipulating a whip. In2020 8th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), pages 685–691, Nov 2020. doi: 10.1109/BioRob49111.2020.9224399

work page doi:10.1109/biorob49111.2020.9224399 2020

[28] [28]

Com- bining self-supervised learning and imitation for vision- based rope manipulation

Ashvin Nair, Dian Chen, Pulkit Agrawal, Phillip Isola, Pieter Abbeel, Jitendra Malik, and Sergey Levine. Com- bining self-supervised learning and imitation for vision- based rope manipulation. In2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2146–2153, 2017. doi: 10.1109/ICRA.2017.7989247

work page doi:10.1109/icra.2017.7989247 2017

[29] [29]

Ratcliffe, Paul L

James D. Ratcliffe, Paul L. Lewin, Eric Rogers, Jari J. Hatonen, and David H. Owens. Norm-optimal iterative learning control applied to gantry robots for automation applications.IEEE Transactions on Robotics, 22(6): 1303–1307, 2006. doi: 10.1109/TRO.2006.882927

work page doi:10.1109/tro.2006.882927 2006

[30] [30]

Sukhatme

Gautam Salhotra, I-Chun Arthur Liu, Marcus Dominguez-Kuhne, and Gaurav S. Sukhatme. Learning deformable object manipulation from expert demonstrations.IEEE Robotics and Automation Letters, 7(4):8775–8782, 2022. doi: 10.1109/LRA.2022.3187843

work page doi:10.1109/lra.2022.3187843 2022

[31] [31]

Schoellig, Fabian L

Angela P. Schoellig, Fabian L. Mueller, and Raffaello D’Andrea. Optimization-based iterative learning for precise quadrocopter trajectory tracking. 33(1):103–127. doi: 10.1007/s10514-012-9283-2. URL https://doi.org/ 10.1007/s10514-012-9283-2

work page doi:10.1007/s10514-012-9283-2

[32] [32]

Optimization- based iterative learning control for trajectory tracking

Angela Sch ¨ollig and Raffaello D’Andrea. Optimization- based iterative learning control for trajectory tracking. In 2009 European Control Conference (ECC), pages 1505– 1510, 2009. doi: 10.23919/ECC.2009.7074619

work page doi:10.23919/ecc.2009.7074619 2009

[33] [33]

A framework for manipulating deformable linear ob- jects by coherent point drift.IEEE Robotics and Automation Letters, 3(4):3426–3433, 2018

Te Tang, Changhao Wang, and Masayoshi Tomizuka. A framework for manipulating deformable linear ob- jects by coherent point drift.IEEE Robotics and Automation Letters, 3(4):3426–3433, 2018. doi: 10. 1109/LRA.2018.2852770. URL https://ieeexplore.ieee. org/document/8403315

work page arXiv 2018

[34] [34]

Drake: Model-based design and verification for robotics, 2019

Russ Tedrake and the Drake Development Team. Drake: Model-based design and verification for robotics, 2019. URL https://drake.mit.edu

work page 2019

[35] [35]

van Oosten, O.H

C.L. van Oosten, O.H. Bosgra, and B.G. Dijkstra. Reduc- ing residual vibrations through iterative learning control, with application to a wafer stage. InProceedings of the 2004 American Control Conference, volume 6, pages 5150–5155 vol.6, 2004. doi: 10.23919/ACC.2004. 1384669

work page doi:10.23919/acc.2004 2004

[36] [36]

An- drew Bagnell

Anirudh Vemula, Wen Sun, Maxim Likhachev, and J. An- drew Bagnell. On the effectiveness of iterative learning control. InProceedings of The 4th Annual Learning for Dynamics and Control Conference, pages 47–58. PMLR. URL https://proceedings.mlr.press/v168/vemula22a.html

work page

[37] [37]

Self-supervised learning of dy- namic planar manipulation of free-end cables

Jonathan Wang, Huang Huang, Vincent Lim, Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Yunliang Chen, and Ken Goldberg. Self-supervised learning of dy- namic planar manipulation of free-end cables. URL http://arxiv.org/abs/2405.09581

work page arXiv

[38] [38]

and Shah, Kunal and Miller, Lee E and Cotton, R

Junyi Wang and Xiaofeng Xiong. A learning-based control framework for human-like whip targeting. In 2024 10th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), pages 550–555. doi: 10.1109/BioRob60516.2024. 10719935. URL https://ieeexplore.ieee.org/document/ 10719935/. ISSN: 2155-1782

work page doi:10.1109/biorob60516.2024 2024

[39] [39]

Multi-robot task planning under individual and collaborative temporal logic specifications

Xiaofeng Xiong, Moses C. Nah, Aleksei Krotov, and Dagmar Sternad. Online impedance adaptation fa- cilitates manipulating a whip. In2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 9297–9302. doi: 10.1109/ IROS51168.2021.9636663. URL https://ieeexplore.ieee. org/document/9636663

work page arXiv 2021

[40] [40]

Motion planning for dynamic knotting of a flexible rope with a high-speed robot arm

Yuji Yamakawa, Akio Namiki, and Masatoshi Ishikawa. Motion planning for dynamic knotting of a flexible rope with a high-speed robot arm. In2010 IEEE/RSJ Inter- national Conference on Intelligent Robots and Systems, pages 49–54, 2010. doi: 10.1109/IROS.2010.5651168

work page doi:10.1109/iros.2010.5651168 2010

[41] [41]

Motion planning for dynamic folding of a cloth with two high-speed robot hands and two high-speed sliders

Yuji Yamakawa, Akio Namiki, and Masatoshi Ishikawa. Motion planning for dynamic folding of a cloth with two high-speed robot hands and two high-speed sliders. In 2011 IEEE International Conference on Robotics and Automation, pages 5486–5491, 2011. doi: 10.1109/ ICRA.2011.5979606

work page arXiv 2011

[42] [42]

Viser: Imperative, web-based 3d visualization in python.arXiv preprint arXiv:2507.22885, 2025

Brent Yi, Chung Min Kim, Justin Kerr, Gina Wu, Re- becca Feng, Anthony Zhang, Jonas Kulhanek, Hongsuk Choi, Yi Ma, Matthew Tancik, and Angjoo Kanazawa. Viser: Imperative, web-based 3d visualization in python. arXiv preprint arXiv:2507.22885, 2025

work page arXiv 2025

[43] [43]

Autonomous robotic la- paroscopic surgery for intestinal anastomosis.Science Robotics, 7(62):eabj2908, 2022

Hang Yin, Anastasia Varava, and Danica Kragic. Mod- eling, learning, perception, and control methods for deformable object manipulation.Science Robotics, 6(54):eabd8803, 2021. doi: 10.1126/scirobotics. abd8803. URL https://www.science.org/doi/abs/10.1126/ scirobotics.abd8803

work page doi:10.1126/scirobotics 2021

[44] [44]

Robots of the lost arc: Self-supervised learning to dynamically manipulate fixed-endpoint cables

Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Jonathan Wang, Huang Huang, and Ken Goldberg. Robots of the lost arc: Self-supervised learning to dynamically manipulate fixed-endpoint cables. URL http://arxiv.org/ abs/2011.04840. TABLE III INVERSEMODELPARAMETERS Parameter Value wcontrol 0.5 wcritical pos 25 wcritical vel 0.00375 wpc 100 wvc 0.1 wRc 5 wpf t ...

work page arXiv 2011

[45] [45]

Inverse Model Parameters:We use the shorthand ∥a∥2 W :=a ⊤Wa. The critical-point objective weights the rope-marker position and velocity errors att c with a diagonal matrix Q:= diag(w critical posI3N , w critical velI3N), whereNis the number of rope markers (links). In general,w is a diagonal cost element for a cost matrix. The control-update regularizer ...

work page

[46] [46]

Lete f t(t;u(t))denote the end-effector error vector defined in Section F

QP Hand Tracking Objective:Fort∈[t c, T], we encourage the robot to match the demonstrator’s follow- through motion by penalizing the end-effector tracking error. Lete f t(t;u(t))denote the end-effector error vector defined in Section F. This error depends nonlinearly on the command through the Bezier spline and forward kinematics. We linearize ef t about...

work page

[47] [47]

Select a temporary range ˜t0 and ˜tf that excludes noisy data from the human picking up and placing the rope on the floor

work page

[48] [48]

Search for maximum hand velocity in the range[ ˜t0, ˜tf] and label ast peak

work page

[49] [49]

Step backwards in time fromt peak and update ˜t0 to the first point when the hand velocity is near-zero (3% of velocity att peak)

work page

[50] [50]

Sett f =t c+35ms as a fixed follow through time length

work page

[51] [51]

h(t)is then defined as 3D pose trajectory of the hand between t0 andt f .x demo(t)is the rope trajectory fromt 0 tot c

Integrate along the hand path motion between ˜t0 andt f , then sett 0 to the time when 5% of the total path length is traveled. h(t)is then defined as 3D pose trajectory of the hand between t0 andt f .x demo(t)is the rope trajectory fromt 0 tot c. Each demonstration type has a different execution speed and overall time length. TABLE V DEMONSTRATIONTRACKIN...

work page