arxiv: 2606.18097 · v1 · pith:PVNBDGNSnew · submitted 2026-06-16 · 💻 cs.RO

WireCraft: A Simulation Benchmark for Industrial DLO Manipulation

Pith reviewed 2026-06-27 00:29 UTC · model grok-4.3

classification 💻 cs.RO
keywords deformable linear objects · wire manipulation · simulation benchmark · reinforcement learning · imitation learning · vision-language-action · industrial assembly · contact-rich manipulation

The pith

WireCraft benchmark shows privileged RL succeeds on industrial wire tasks while vision policies struggle with contact-rich alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces WireCraft as a simulation benchmark for industrial manipulation of deformable linear objects such as wires and cables. It creates three task families—connector insertion, clip routing, and channel seating—along with two physics models and trajectories from both simulation and a physical robot. Benchmarking reinforcement learning, imitation learning, and vision-language-action policies reveals that privileged state-based RL reaches over 82 percent success in representative settings for each family. This establishes that the tasks are well-posed under full state access, yet vision-based methods encounter persistent difficulties during the shift from reaching to precise contact alignment, especially in connector insertion. The result matters because DLOs appear throughout assembly work and current learning approaches lack a shared, industrially relevant testbed for progress.

Core claim

The authors present WireCraft, a configurable simulation benchmark spanning connector insertion, clip routing, and channel seating tasks for deformable linear objects. Using trajectories from simulation and a physical UR5 robot, they benchmark multiple policy types and find that privileged state-based reinforcement learning solves a representative setting in each family with over 82% success. For connector insertion, the transition to contact-rich alignment remains difficult for vision RL, IL, and VLA policies. Together, these results indicate that industrial DLO manipulation is tractable under privileged information but remains an open challenge for vision-based approaches.

What carries the argument

The WireCraft simulation benchmark with its three task families, articulated and deformable DLO physics models, configurable assets, and shared evaluation metrics across RL, IL, and VLA policies.

If this is right

  • Privileged state information makes each task family solvable by current reinforcement learning methods.
  • The shift from free-space reaching to contact-rich alignment in connector insertion is the primary remaining difficulty for vision-based approaches.
  • The benchmark supplies modular, customizable assets that support controlled variation in task difficulty.
  • Standardized metrics across RL, IL, and VLA allow direct comparison of policy families on the same tasks.
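
The reach-versus-insert gap reported for connector insertion can be made concrete with a small sketch. The code below is illustrative only: the `Episode` record, its `reached` and `inserted` flags, and the toy episode list are assumptions made for this example, not WireCraft's API or data. It computes two stage-wise success rates over the same episodes and the gap between them.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # Hypothetical per-episode flags; the benchmark's real success
    # criteria are defined in the paper, not here.
    reached: bool   # plug arrived in the socket neighborhood
    inserted: bool  # plug was fully seated


def stage_success_rates(episodes):
    """Return (SR_reach, SR_insert) as fractions of all episodes."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    sr_reach = sum(e.reached for e in episodes) / n
    # An insertion only counts if the reach stage also succeeded.
    sr_insert = sum(e.reached and e.inserted for e in episodes) / n
    return sr_reach, sr_insert


# Toy data for illustration, not results from the paper.
episodes = [
    Episode(reached=True, inserted=True),
    Episode(reached=True, inserted=False),
    Episode(reached=True, inserted=False),
    Episode(reached=False, inserted=False),
]
sr_reach, sr_insert = stage_success_rates(episodes)
gap = sr_reach - sr_insert  # large gap = failures concentrate at contact
```

Reporting both rates over the same denominator keeps the gap interpretable: it is the fraction of episodes that reached the socket but failed during contact-rich alignment.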

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Better handling of contact transitions in vision models could narrow the performance gap to privileged-state methods.
  • Extending the benchmark to include more sequential multi-step assemblies would test generalization beyond the current three families.
  • Real-world validation on physical hardware would test whether the privileged success rates translate outside simulation.

Load-bearing premise

The simulation environments and task definitions accurately capture the dynamics and contact behaviors of real industrial DLO manipulation.

What would settle it

Demonstrating a vision-based policy that reaches over 80 percent success on the connector insertion task under the same simulation conditions would falsify the claim that contact-rich alignment remains a key bottleneck.

Figures

Figures reproduced from arXiv: 2606.18097 by Artem Arutyunov, Chi-Guhn Lee, Chongyu Zhu, Hyegang Kim, Jiachen Rao, Ramy ElMallah, Seungyeon Ha, Zachary Tang.

Figure 1: WireCraft simulates industrial DLO manipulation across three task families: connector insertion, clip routing, and channel seating, with trajectories from multiple sources. Tasks run on the UR5, Franka, and Trossen Stationary AI in simulation, with real-world validation on a UR5. General robot learning benchmarks such as RLBench [12] and ManiSkill3 [13] are not centered on DLO manipulation and provide limit…
Figure 2: WireCraft Architecture, including the WireCraft simulation engine, asset library, task suite, demonstration sources, policy baselines, and real-world validation setup. WireCraft supports articulated and FEM-based deformable wire models, provides configurable industrial DLO manipulation tasks across connector insertion, clip routing, and channel seating, and pairs simulation data with real-world teleoperat…
Figure 3: WireCraft Assets. Top: the two configurable wire representations, articulated and FEM-based deformable, shown as rendered DLOs (left) and their construction-level chain/mesh structure (right). Bottom: a subset of representative 3D-printable connectors and task fixtures. … from primitives such as cylinders and cuboids. They differ in rotational symmetry, allowing task difficulty to range from rotation-invaria…
Figure 4: Ethernet Connector Insertion performance. SR_reach and SR_insert on the shared simulated insertion task show privileged RL solves the task reliably, while vision-based baselines exhibit a substantial gap between reaching the socket neighborhood and completing contact-rich insertion.
Figure 5: DLO–Connector Physics Demonstration. Visualization of articulated and deformable wire concatenated with a DisplayPort port across representative simulation frames.
Figure 6: Scaling behavior of the two WireCraft wire models. Both models were evaluated in matched headless DLO-only scenes with a 1/120 s physics timestep, 5 s warmup, 60 s measured rollout. Absolute throughput depends on discretization and solver settings; the comparison is intended to show scaling behavior of the selected benchmark configurations. The articulation model scales to 8192 successful environments, wh…
Figure 7: Scripted demonstration rollouts. Each row shows a scripted policy executing one task family across representative phases, progressing from the initial state through grasp and manipulation to task completion. E RL Experiment Detail E.1 State-based RL We detail the privileged RL configuration on connector insertion, our primary RL setting. On insertion, the two privileged baselines, PPO and SACfD, share th…
Figure 8: RL rollout demonstrations. Each row shows a trained State PPO policy rolled out on one task family across representative phases. The clip-routing row additionally illustrates an error and subsequent recovery.
Figure 9: Real vs. noise-augmented simulation observations. We add sensor-style noise to the simulated RGB streams to narrow the appearance gap with the physical UR5 cameras. Each column pairs a real camera view with its noise-augmented simulated counterpart for the side and wrist views.
Figure 10: Representative real-world failure modes on Ethernet insertion. (a) The policy grasps the connector but drives downward into the task board instead of lifting, triggering a protective stop. (b) The plug reaches the socket neighborhood but is not rotationally aligned, so it cannot be seated. (c) The gripper closes too far back, grasping the cable rather than the connector body. (d) The gripper closes too fa…
Figure 11: A teleoperated connector-insertion trajectory in simulation. A human operator drives the UR5 end-effector through eight representative stages: (1) initialization with the plug resting on the table; (2) approaching the plug; (3) aligning the gripper above it; (4) grasping the connector, typically after several attempts, since the operator must re-position and re-close the gripper to secure a firm grasp; (5…
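
The measurement protocol quoted in the Figure 6 caption (1/120 s physics timestep, 5 s warmup, 60 s measured rollout) implies a fixed number of physics steps per environment. The sketch below works through that arithmetic and shows one plausible way to turn a wall-clock time into aggregate throughput. The function names and the wall-clock value are assumptions for illustration; the paper's own throughput definition may differ.

```python
def steps_per_env(duration_s, dt):
    """Number of physics steps needed to cover duration_s at timestep dt."""
    return round(duration_s / dt)


def throughput(num_envs, rollout_s, dt, wall_clock_s):
    """Aggregate environment steps per wall-clock second (assumed metric)."""
    return num_envs * steps_per_env(rollout_s, dt) / wall_clock_s


DT = 1 / 120                               # physics timestep from the caption
warmup_steps = steps_per_env(5.0, DT)      # warmup steps, excluded from timing
measured_steps = steps_per_env(60.0, DT)   # measured steps per environment

# Hypothetical wall-clock time, NOT a measured result from the paper.
example = throughput(num_envs=8192, rollout_s=60.0, dt=DT, wall_clock_s=120.0)
```

With these settings each environment takes 600 warmup steps and 7200 measured steps, so any throughput comparison between the two wire models rests on the same per-environment step count.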
Abstract

Deformable Linear Objects (DLOs), such as wires and cables, are central to industrial assembly. Unlike rigid objects, whose state is captured by a 6-DoF pose, DLOs have an infinite-dimensional configuration space and deform continuously under contact with grippers, fixtures, and the workspace, making them a demanding benchmark for general dexterous manipulation. Despite their importance, policy development and comparison remain difficult: existing benchmarks are often tied to specific hardware setups, lack modular and customizable task assets, or study generic deformable-object tasks without the fixtures relevant to real-world industrial wire manipulation. Few benchmarks align simulation, real-world data, and shared evaluation protocols. To bridge this gap, we introduce WireCraft, a simulation benchmark for industrial DLO manipulation with configurable difficulty and assets, spanning three task families: connector insertion, clip routing, and channel seating. It supports two complementary DLO physics models, articulated and deformable, and the trajectories come from both simulation and a physical UR5. We benchmark reinforcement learning (RL), imitation learning (IL), and vision-language-action (VLA) policies under shared metrics. Privileged state-based RL solves a representative setting in each task family with over 82% success, confirming the tasks are well-posed. For connector insertion, however, the transition from reaching the socket to contact-rich alignment remains a key bottleneck for vision RL, IL, and VLA policies. These results indicate that industrial DLO manipulation, though tractable under privileged state, remains an open challenge for current vision-based learning. The benchmark, data, and tools will be open-sourced upon acceptance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces WireCraft, a simulation benchmark for industrial Deformable Linear Object (DLO) manipulation spanning connector insertion, clip routing, and channel seating. It supports articulated and deformable DLO physics models, provides trajectories from both simulation and a physical UR5, and evaluates RL, IL, and VLA policies under shared metrics. The central empirical claim is that privileged state-based RL achieves over 82% success on representative settings in each task family (confirming the tasks are well-posed), while vision-based methods encounter a key bottleneck in the transition to contact-rich alignment for connector insertion.

Significance. If the simulation environments faithfully reproduce real contact dynamics, the benchmark would provide a valuable, modular platform for comparing learning methods on industrially relevant DLO tasks with configurable difficulty, shared evaluation protocols, and both simulation and real trajectories. The explicit promise to open-source the benchmark, data, and tools upon acceptance is a concrete strength that would enable reproducibility and community use.

major comments (2)
  1. [Abstract] The claim that privileged state-based RL success rates >82% confirm the tasks are well-posed rests on the assumption that the two DLO physics models accurately reproduce real industrial contact forces, friction, and deformation; however, the manuscript provides no quantitative sim-to-real metrics (e.g., force-torque residuals or configuration error distributions) for the contact-rich phases identified as bottlenecks, despite noting trajectories collected on a physical UR5.
  2. [Abstract] Performance numbers are stated without specifying simulation fidelity parameters, data exclusion rules, or exact evaluation protocols (success criteria, episode lengths, randomization ranges), which prevents independent verification that the reported privileged RL results demonstrate the tasks are solvable in principle rather than artifacts of the chosen simulator settings.
minor comments (1)
  1. [Abstract] The abstract would benefit from reporting per-task-family success rates rather than the aggregate 'over 82%' figure to allow readers to assess variability across connector insertion, clip routing, and channel seating.
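
One way to act on the request for per-family reporting is to attach an interval to each success rate, so that variability across families is visible. The sketch below uses the Wilson score interval, which is a choice made here for illustration and is not something the paper or the referee specifies. The function name and the example counts are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate (z=1.96 ~ 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie in [0, trials]")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    return center - half, center + half


# Hypothetical counts for illustration only, not results from the paper.
low, high = wilson_interval(successes=82, trials=100)
```

An interval per task family would let readers judge whether a figure such as "over 82%" reflects a tight estimate or a small evaluation set.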

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments on the abstract. We address each point below and indicate planned revisions where appropriate.

  1. Referee: [Abstract] The claim that privileged state-based RL success rates >82% confirm the tasks are well-posed rests on the assumption that the two DLO physics models accurately reproduce real industrial contact forces, friction, and deformation; however, the manuscript provides no quantitative sim-to-real metrics (e.g., force-torque residuals or configuration error distributions) for the contact-rich phases identified as bottlenecks, despite noting trajectories collected on a physical UR5.

    Authors: We agree that the lack of quantitative sim-to-real metrics (such as force-torque residuals or configuration errors) for contact-rich phases weakens the claim that the >82% success rates confirm the tasks are well-posed for real industrial use. Real UR5 trajectories were collected to enable future transfer studies and qualitative checks, but no such quantitative analysis was performed. In revision we will modify the abstract to state that the success rates demonstrate solvability within the provided simulation environments and will add an explicit limitations paragraph on the sim-to-real gap. We cannot supply the requested metrics because they were not computed. revision: partial

  2. Referee: [Abstract] Performance numbers are stated without specifying simulation fidelity parameters, data exclusion rules, or exact evaluation protocols (success criteria, episode lengths, randomization ranges), which prevents independent verification that the reported privileged RL results demonstrate the tasks are solvable in principle rather than artifacts of the chosen simulator settings.

    Authors: The full experimental protocols, including simulation fidelity settings, success criteria, episode lengths, randomization ranges, and any data exclusion rules, are detailed in Section 4 and the supplementary material. We will revise the abstract to include a direct reference or footnote to these sections so that the reported numbers can be independently verified against the stated protocols. revision: yes

standing simulated objections not resolved
  • Quantitative sim-to-real metrics for contact-rich phases

Circularity Check

0 steps flagged

No circularity; empirical benchmark with no derivation chain

The paper introduces WireCraft as a simulation benchmark for DLO tasks and reports empirical success rates (>82% for privileged RL) to support that tasks are well-posed. This is a direct experimental claim with no equations, fitted parameters renamed as predictions, self-definitional steps, or load-bearing self-citations. The work is self-contained as a benchmark paper; the sim-to-real assumption is an external validity issue, not a reduction of any derivation to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, mathematical axioms, or invented physical entities are introduced; the contribution consists of new task assets and evaluation protocols whose validity depends on unstated simulation fidelity assumptions.

Reference graph

Works this paper leans on

58 extracted references · 10 canonical work pages

  1. [1]

    J. Zhu, A. Cherubini, C. Dune, D. Navarro-Alarcon, F. Alambeigi, D. Berenson, F. Ficuciello, K. Harada, J. Kober, X. Li, et al. Challenges and outlook in robotic manipulation of deformable objects.IEEE Robotics & Automation Magazine, 29(3):67–77, 2022

  2. [2]

    X. Zhang, H.-C. Lin, Y . Zhao, and M. Tomizuka. Harnessing with twisting: Single-arm de- formable linear object manipulation for industrial harnessing task. In2024 IEEE/RSJ Interna- tional Conference on Intelligent Robots and Systems (IROS), pages 4069–4075. IEEE, 2024

  3. [3]

    A. Wilson, H. Jiang, W. Lian, and W. Yuan. Cable routing and assembly using tactile- driven motion primitives. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 10408–10414. IEEE, 2023

  4. [4]

    J. Luo, C. Xu, X. Geng, G. Feng, K. Fang, L. Tan, S. Schaal, and S. Levine. Multistage cable routing through hierarchical imitation learning.IEEE Transactions on Robotics, 40: 1476–1491, 2024

  5. [5]

    S. Jin, W. Lian, C. Wang, M. Tomizuka, and S. Schaal. Robotic cable routing with spatial representation.IEEE Robotics and Automation Letters, 7(2):5687–5694, 2022. doi:10.1109/ LRA.2022.3158377

  6. [6]

    M. Yu, H. Zhong, and X. Li. Shape control of deformable linear objects with offline and online learning of local linear deformation models. In2022 International Conference on Robotics and Automation (ICRA), pages 1337–1343. IEEE, 2022

  7. [7]

    F. Gu, Z. Wang, Z. Zhu, J. Ma, Y . Zhou, S. Jiang, and B. He. A survey on robotic manipulation of deformable objects: Recent advances, open challenges and new frontiers.Neurocomputing, page 134058, 2026

  8. [8]

    W. Zhang, K. Schmeckpeper, P. Chaudhari, and K. Daniilidis. Deformable linear object predic- tion using locally linear latent dynamics. In2021 IEEE International Conference on Robotics and Automation (ICRA), pages 13503–13509. IEEE, 2021

  9. [9]

    M. Li and C. Choi. Learning for deformable linear object insertion leveraging flexibility esti- mation from visual cues. In2024 IEEE International Conference on Robotics and Automation (ICRA), page 5183–5189. IEEE, May 2024. doi:10.1109/icra57147.2024.10610419. URL http://dx.doi.org/10.1109/ICRA57147.2024.10610419

  10. [10]

    B. Cao, X. Zang, X. Zhang, Z. Chen, S. Li, and J. Zhao. Shape control of elastic deformable linear objects for robotic cable assembly.Advanced Intelligent Systems, 6(7):2300835, 2024

  11. [11]

    M. Yan, Y . Zhu, N. Jin, and J. Bohg. Self-supervised learning of state estimation for ma- nipulating deformable linear objects.IEEE robotics and automation letters, 5(2):2372–2379, 2020

  12. [12]

    S. James, Z. Ma, D. R. Arrojo, and A. J. Davison. Rlbench: The robot learning benchmark & learning environment.IEEE Robotics and Automation Letters, 5(2):3019–3026, 2020

  13. [13]

    S. Tao, F. Xiang, A. Shukla, Y . Qin, X. Hinrichsen, X. Yuan, C. Bao, X. Lin, Y . Liu, T. kai Chan, Y . Gao, X. Li, T. Mu, N. Xiao, A. Gurha, V . N. Rajesh, Y . W. Choi, Y .-R. Chen, Z. Huang, R. Calandra, R. Chen, S. Luo, and H. Su. Maniskill3: Gpu parallelized robotics simulation and rendering for generalizable embodied ai.Robotics: Science and Systems, 2025

  14. [14]

    X. Lin, Y. Wang, J. Olkin, and D. Held. Softgym: Benchmarking deep reinforcement learning for deformable object manipulation. In Conference on Robot Learning, pages 432–448. PMLR, 2021.

  15. [15]

    D. Seita, P. Florence, J. Tompson, E. Coumans, V . Sindhwani, K. Goldberg, and A. Zeng. Learning to rearrange deformable cables, fabrics, and bags with goal-conditioned transporter networks. In2021 IEEE International Conference on Robotics and Automation (ICRA), pages 4568–4575. IEEE, 2021

  16. [16]

    S. Chen, Y . Xu, C. Yu, L. Li, X. Ma, Z. Xu, and D. Hsu. Daxbench: Benchmarking deformable object manipulation with differentiable physics. InThe Eleventh International Conference on Learning Representations, 2023

  17. [17]

    R. Laezza, R. Gieselmann, F. T. Pokorny, and Y . Karayiannidis. Reform: A robot learning sandbox for deformable linear object manipulation. In2021 IEEE International Conference on Robotics and Automation (ICRA), pages 4717–4723, 2021. doi:10.1109/ICRA48506.2021. 9561766

  18. [18]

    J. Cao, Y . Wang, Z. Xiong, C. Lin, Z. Chen, and C. Gan. DLO-Lab: Benchmarking deformable linear object manipulations with differentiable physics. InProceedings of the 43rd Interna- tional Conference on Machine Learning (ICML), 2026. URLhttps://icml.cc/virtual/ 2026/poster/63391. Accepted

  19. [19]

    M. Li, H. Yu, Y . Huang, Y . Hong, C. Choi, and H. Ye. Hierarchical dlo routing with re- inforcement learning and in-context vision-language models, 2025. URLhttps://arxiv. org/abs/2510.19268

  20. [20]

    J. Tanureza, B. Michalak, M. Radke, and K. Haninger. Industrial cabling in constrained en- vironments: a practical approach and current challenges. In2025 IEEE/SICE International Symposium on System Integration (SII), pages 1428–1433. IEEE, 2025

  21. [21]

    C. Kienle, B. Alt, F. Schneider, T. Pertlwieser, R. J ¨akel, and R. Rayyes. Ai-based framework for robust model-based connector mating in robotic wire harness installation. In2025 IEEE 21st International Conference on Automation Science and Engineering (CASE), pages 2444–

  22. [22]

    National Institute of Standards and Technology. Assembly Performance Metrics and Test Methods.https://www.nist.gov/el/intelligent-systems-division-73500/ robotic-grasping-and-manipulation-assembly/assembly, 2018. Accessed: 2026- 05-22

  23. [23]

    Y . Chen, K. Kimble, E. H. Adelson, T. Asfour, P. Chanrungmaneekul, S. Chitta, Y . Chitambar, Z. Chen, K. Goldberg, D. Kragic, et al. Manipulationnet: An infrastructure for benchmark- ing real-world robot manipulation with physical skill challenges and embodied multimodal reasoning.arXiv preprint arXiv:2603.04363, 2026

  24. [24]

    Z. Huang, Y . Hu, T. Du, S. Zhou, H. Su, J. B. Tenenbaum, and C. Gan. Plasticinelab: A soft-body manipulation benchmark with differentiable physics. InInternational Conference on Learning Representations, 2021

  25. [25]

    Y . Zhang, K. S. Luck, F. Verdoja, V . Kyrki, and J. Pajarinen. Modesuite: Robot learning task suite for benchmarking mobile manipulation with deformable objects.IEEE Robotics and Automation Letters, 2026

  26. [26]

    Q. J. Chen, T. Bretl, and Q.-C. Pham. Accurate simulation and parameter identification of deformable linear objects using discrete elastic rods in generalized coordinates. In2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 20454– 20460, 2025. doi:10.1109/IROS60139.2025.11247160

  27. [27]

    M. Bergou, M. Wardetzky, S. Robinson, B. Audoly, and E. Grinspun. Discrete elastic rods. In ACM Siggraph 2008 Papers, pages 1–12. 2008.

  28. [28]

    Z. Sun, J. Zhu, and R. B. Fisher. DexDLO: Learning goal-conditioned dexterous policy for dynamic manipulation of deformable linear objects. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 16009–16015. IEEE, 2024. doi:10.1109/ICRA57147. 2024.10610754

  29. [29]

    Y . Chen, Y . Zhang, Z. Brei, T. Zhang, Y . Chen, J. Wu, and R. Vasudevan. Differentiable discrete elastic rods for real-time modeling of deformable linear objects. In P. Agrawal, O. Kroemer, and W. Burgard, editors,Proceedings of The 8th Conference on Robot Learning, volume 270 ofProceedings of Machine Learning Research, pages 2996–3014. PMLR, 2025. URLh...

  30. [30]

    Y . Chen, X. Wu, Y . Zong, Y . Chen, A. Li, B. Zhang, and R. Vasudevan. DEFT: Differen- tiable branched discrete elastic rods for modeling furcated DLOs in real-time.arXiv preprint arXiv:2502.15037, 2025. doi:10.48550/arXiv.2502.15037. URLhttps://arxiv.org/abs/ 2502.15037

  31. [31]

    A. Govoni, N. Zubair, S. Soprani, and G. Palli. Performance analysis of a mass-spring-damper deformable linear object model in robotic simulation frameworks. InEuropean Robotics Forum 2025, volume 36 ofSpringer Proceedings in Advanced Robotics, pages 187–192. Springer,

  32. [32]

    doi:10.1007/978-3-031-89471-8 29

  33. [33]

    M. Li, H. Yu, and C. Choi. Routing manipulation of deformable linear object using reinforce- ment learning and diffusion policy. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 01–07, 2025. doi:10.1109/ICRA55743.2025.11127451

  34. [34]

    T. Z. Zhao, J. Luo, O. Sushkov, R. Pevceviciute, N. Heess, J. Scholz, S. Schaal, and S. Levine. Offline meta-reinforcement learning for industrial insertion. In2022 IEEE International Con- ference on Robotics and Automation (ICRA), pages 6386–6393. IEEE, 2022

  35. [35]

    B. Cao, X. Zang, S. Li, X. Zhang, C. le Li, and J. Zhao. Robotic cable following and in- sertion with tactile sensing.IEEE Sensors Journal, 26:6216–6224, 2026. URLhttps: //api.semanticscholar.org/CorpusID:284252279

  36. [36]

    H. Zhou, S. Li, Q. Lu, and J. Qian. A practical solution to deformable linear object manip- ulation: A case study on cable harness connection. In2020 5th International Conference on Advanced Robotics and Mechatronics (ICARM), pages 329–333. IEEE, 2020

  37. [37]

    M. Yu, K. Lv, C. Wang, Y . Jiang, M. Tomizuka, and X. Li. Generalizable whole-body global manipulation of deformable linear objects by dual-arm robot in 3-d constrained environments. The International Journal of Robotics Research, 44(4):607–639, 2025

  38. [38]

    W. Li, Z. Zhu, D. Li, Z. Gong, N. Lv, H. Chen, and J. Li. An automated cable laying simulation method by robot for planar routing. InProceedings of the 2025 11th Annual International Conference on Network and Information Systems for Computers, ICNISC ’25, page 209–214, New York, NY , USA, 2025. Association for Computing Machinery. ISBN 9798400715839. doi:...

  39. [39]

    P. Chang, R. Luo, M. Zolotas, and T. Padır. Manipulation of deformable linear objects in benchmark task spaces. In2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), pages 1910–1916, 2022. doi:10.1109/CASE49997.2022.9926677

  40. [40]

    K. Kimble, J. Albrecht, M. Zimmerman, and J. Falco. Performance measures to benchmark the grasping, manipulation, and assembly of deformable objects typical to manufacturing ap- plications.Frontiers in Robotics and AI, 9:999348, 2022

  41. [41]

    O. Azulay, K. Kondap, J. Drake, S. Xie, H. Li, S. Chitta, and K. Goldberg. Motorcycle 1.0: Automating bimanual cable routing around fixtures on the nist task board. In 2025 IEEE 21st International Conference on Automation Science and Engineering (CASE), pages 2636–2641. IEEE, 2025.

  42. [42]

    J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms.arXiv preprint arXiv:1707.06347, 2017

  43. [43]

    M. Vecerik, T. Hester, J. Scholz, F. Wang, O. Pietquin, B. Piot, N. Heess, T. Roth¨orl, T. Lampe, and M. Riedmiller. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards.arXiv preprint arXiv:1707.08817, 2017

  44. [44]

    T. Z. Zhao, V . Kumar, S. Levine, and C. Finn. Learning fine-grained bimanual manipulation with low-cost hardware.Robotics: Science and Systems (RSS), 2023. doi:10.15607/RSS.2023. XIX.016

  45. [45]

    C. Chi, Z. Xu, S. Feng, E. Cousineau, Y . Du, B. Burchfiel, R. Tedrake, and S. Song. Diffusion policy: Visuomotor policy learning via action diffusion.The International Journal of Robotics Research, 44(10-11):1684–1704, 2025

  46. [46]

    W. Peebles and S. Xie. Scalable diffusion models with transformers.2023 IEEE/CVF In- ternational Conference on Computer Vision (ICCV), pages 4172–4182, 2023. doi:10.1109/ iccv51070.2023.00387

  47. [47]

    Z. Hou, T. Zhang, Y . Xiong, H. Duan, H. Pu, R. Tong, C. Zhao, X. Zhu, Y . Qiao, J. Dai, et al. Dita: Scaling diffusion transformer for generalist vision-language-action policy. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7686– 7697, 2025

  48. [48]

    K. Black, N. Brown, et al.π 0.5: a vision-language-action model with open-world generaliza- tion. InProceedings of the 9th Conference on Robot Learning (CoRL), 2025. Oral Presentation

  49. [49]

    T. Hoang, H. Le, P. Becker, V . A. Ngo, and G. Neumann. Geometry-aware rl for manipu- lation of varying shapes and deformable objects. InInternational Conference on Learning Representations, volume 2025, pages 47376–47405, 2025

  50. [50]

    E. Xing, V . Luk, and J. Oh. Stabilizing reinforcement learning in differentiable multiphysics simulation. InInternational Conference on Learning Representations, volume 2025, pages 91165–91198, 2025

  51. [51]

    M. Mittal et al. Isaac lab: A gpu-accelerated simulation framework for multi-modal robot learning.arXiv preprint arXiv:2511.04831, 2025

  52. [52]

    NVIDIA. Nvidia isaac sim.https://github.com/isaac-sim/IsaacSim, 2026. URL https://docs.isaacsim.omniverse.nvidia.com/latest/index.html

  53. [53]

    Isaac Lab Project Developers. Interacting with a Deformable Object. Isaac Lab Doc- umentation, 2026. URLhttps://isaac-sim.github.io/IsaacLab/main/source/ tutorials/01_assets/run_deformable_object.html. Accessed: 2026-05-16

  54. [54]

    NVIDIA. Soft Bodies. PhysX Documentation, 2024. URLhttps://nvidia-omniverse. github.io/PhysX/physx/5.4.0/docs/SoftBodies.html. Accessed: 2026-05-16

  55. [55]

    M. Oquab, T. Darcet, T. Moutakanni, H. V o, M. Szafraniec, V . Khalidov, P. Fernandez, D. Haz- iza, F. Massa, A. El-Nouby, et al. Dinov2: Learning robust visual features without supervision. Transactions on Machine Learning Research Journal, 2024

  56. [56]

    A. Jaegle, S. Borgeaud, J.-B. Alayrac, C. Doersch, C. Ionescu, D. Ding, S. Koppula, D. Zoran, A. Brock, E. Shelhamer, et al. Perceiver io: A general architecture for structured inputs & outputs. InInternational Conference on Learning Representations, 2022

  57. [57]

    L. Pinto, M. Andrychowicz, P. Welinder, W. Zaremba, and P. Abbeel. Asymmetric actor critic for image-based robot learning. InRobotics: Science and Systems, 2018

  58. [58]

    R. Cadene, S. Alibert, A. Soare, Q. Gallouedec, A. Zouitine, S. Palma, P. Kooijmans, M. Aractingi, M. Shukor, D. Aubakirova, et al. Lerobot: State-of-the-art machine learning for real-world robotics in pytorch. https://github.com/huggingface/lerobot, 2024.