Pith · machine review for the scientific record

arxiv: 2603.05687 · v3 · submitted 2026-03-05 · 💻 cs.RO

Recognition: no theorem link

Contact-Grounded Policy: Dexterous Visuotactile Policy with Generative Contact Grounding

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 15:35 UTC · model grok-4.3

classification 💻 cs.RO
keywords dexterous manipulation · visuotactile policy · contact grounding · diffusion model · compliance controller · multi-finger hand · in-hand manipulation · tool use

The pith

Contact-Grounded Policy improves dexterous manipulation by predicting state-tactile trajectories and mapping them to compliant controller targets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Contact-Grounded Policy (CGP) for contact-rich dexterous manipulation where success hinges on evolving multi-point contacts that depend on geometry and friction. CGP uses a conditional diffusion model to forecast future robot states and tactile feedback together in a compressed latent space. A learned contact-consistency mapping then turns those predictions into executable target states for a compliance controller. This explicit grounding lets the policy account for how its outputs interact with low-level contact dynamics. On a physical four-finger Allegro hand and a simulated five-finger Tesollo hand, CGP outperforms standard visuomotor and visuotactile diffusion baselines across in-hand manipulation, delicate grasping, and tool-use tasks.
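The two-stage pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration: all dimensions and function names are assumptions, and the linear stand-ins replace what the paper implements as a conditional diffusion model and a learned neural mapping.

```python
import numpy as np

# Hypothetical dimensions -- the paper's actual latent sizes are not given here.
LATENT_DIM = 32      # compressed state-tactile latent
STATE_DIM = 22       # e.g. joint-space target for a multi-finger hand
HORIZON = 8          # forecast horizon in time steps

rng = np.random.default_rng(0)

def forecast_latents(obs_latent, horizon=HORIZON):
    """Stand-in for the conditional diffusion model: it would denoise a
    latent state-tactile trajectory conditioned on current observations.
    Here we just return a plausibly shaped array near the observation."""
    return obs_latent + 0.1 * rng.standard_normal((horizon, LATENT_DIM))

def contact_consistency_map(latent_traj, W):
    """Stand-in for the learned contact-consistency mapping: converts each
    predicted state-tactile latent into an executable target robot state
    for the compliance controller."""
    return latent_traj @ W  # (horizon, STATE_DIM)

# Toy weights for the mapping (in the paper this is a trained network).
W = 0.05 * rng.standard_normal((LATENT_DIM, STATE_DIM))

obs_latent = rng.standard_normal(LATENT_DIM)
latents = forecast_latents(obs_latent)          # stage (i): generative forecast
targets = contact_consistency_map(latents, W)   # stage (ii): executable targets
```

The point of the structure, per the abstract, is that the controller receives targets already filtered for contact consistency rather than raw policy actions.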

Core claim

CGP grounds multi-point contacts by predicting coupled trajectories of actual robot state and tactile feedback with a conditional diffusion model in compressed latent space, then applies a learned contact-consistency mapping to convert the predicted state-tactile pairs into executable target robot states for a compliance controller that can realize the intended contacts.

What carries the argument

The learned contact-consistency mapping that converts predicted robot state-tactile pairs into executable targets for the compliance controller.

Load-bearing premise

The learned contact-consistency mapping will reliably convert predicted state-tactile pairs into executable targets that the compliance controller can realize without introducing new slip or instability.

What would settle it

Measure whether the compliance controller achieves the exact predicted contacts without added slip when executing the mapped targets versus baseline predictions on the physical Allegro hand during a delicate grasping trial.
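One concrete way to operationalize "added slip" in such a trial, assuming tracked contact points and contact normals are available from tactile or marker data, is to accumulate the tangential displacement of each contact point between frames. This metric is illustrative, not the paper's.

```python
import numpy as np

def added_slip(contact_pts, normals):
    """Toy slip measure: total tangential displacement of tracked contact
    points between consecutive frames.
    contact_pts: (T, K, 3) contact positions over T frames for K contacts.
    normals: (T, K, 3) unit contact normals."""
    disp = np.diff(contact_pts, axis=0)                       # (T-1, K, 3)
    n = normals[:-1]
    normal_comp = np.sum(disp * n, axis=-1, keepdims=True) * n
    tangential = disp - normal_comp                           # drop normal motion
    return float(np.linalg.norm(tangential, axis=-1).sum())

# Static contacts produce zero slip.
pts = np.zeros((5, 3, 3))
nrm = np.tile(np.array([0.0, 0.0, 1.0]), (5, 3, 1))
assert added_slip(pts, nrm) == 0.0
```

Comparing this quantity between CGP-mapped targets and baseline predictions over repeated delicate-grasping trials would directly test the load-bearing premise.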

Figures

Figures reproduced from arXiv: 2603.05687 by Amirhossein H. Memar, Ben Abbatematteo, Jom Preechayasomboon, Nick Colonnese, Sonny Chan, Yeping Wang, Zhengtong Xu.

Figure 1: Contact-Grounded Policy (CGP) enables fine-grained, contact-rich dexterous manipulation by grounding multi-point … [figures/full_fig_p001_1.png]
Figure 2: Schematic of contact grounding using a 3-DoF revolute … [figures/full_fig_p003_2.png]
Figure 3: Overview of Contact-Grounded Policy (CGP). CGP grounds multi-point contacts by predicting coupled trajectories … [figures/full_fig_p004_3.png]
Figure 4: Teleoperation pipeline. We use a Meta Quest 3 headset … [figures/full_fig_p005_4.png]
Figure 6: Hand configuration predictions by the contact … [figures/full_fig_p006_6.png]
Figure 7: Ablation results of KL regularization for tactile com… [figures/full_fig_p007_7.png]
Figure 8: Inference time comparison. Average time over 50 … [figures/full_fig_p007_8.png]
Figure 9: VAE reconstruction examples on the validation set for tactile arrays. We show ground truth (top) and reconstruction … [figures/full_fig_p011_9.png]
Figure 10: VAE reconstruction examples on the validation set for Digit360 tactile images. We show ground truth (top) and … [figures/full_fig_p011_10.png]
Figure 11: 11 objects with different shapes and sizes used for … [figures/full_fig_p011_11.png]
Figure 12: Snapshot of real-world Contact-Grounded Policy rollouts, with overlaid target and actual robot states. Before contact, … [figures/full_fig_p013_12.png]
Original abstract

Contact-rich dexterous manipulation with multi-finger hands remains an open challenge in robotics because task success depends on multi-point contacts that continuously evolve and are highly sensitive to object geometry, frictional transitions, and slip. Recently, tactile-informed manipulation policies have shown promise. However, most use tactile signals as additional observations rather than modeling contact state or how their action outputs interact with low-level controller dynamics. We present Contact-Grounded Policy (CGP), a visuotactile policy that grounds multi-point contacts by predicting coupled trajectories of actual robot state and tactile feedback, and using a learned contact-consistency mapping to convert these predictions into executable target robot states for a compliance controller. CGP consists of two components: (i) a conditional diffusion model that forecasts future robot state and tactile feedback in a compressed latent space, and (ii) a learned contact-consistency mapping that converts the predicted robot state-tactile pair into executable targets for a compliance controller, enabling it to realize the intended contacts. We evaluate CGP using a physical four-finger Allegro V5 hand with Digit360 fingertip tactile sensors, and a simulated five-finger Tesollo DG-5F hand with dense whole-hand tactile arrays. Across a range of dexterous tasks including in-hand manipulation, delicate grasping, and tool use, CGP outperforms visuomotor and visuotactile diffusion-policy baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Contact-Grounded Policy (CGP), a visuotactile policy for dexterous manipulation. It employs a conditional diffusion model to forecast coupled trajectories of robot state and tactile feedback in latent space, paired with a learned contact-consistency mapping that translates these predictions into target states for a compliance controller. The approach is evaluated on physical and simulated multi-finger hands across in-hand manipulation, delicate grasping, and tool use tasks, claiming superior performance over visuomotor and visuotactile diffusion baselines.

Significance. If the empirical claims hold under rigorous verification, CGP could advance contact-rich dexterous manipulation by explicitly modeling evolving multi-point contacts and grounding predictions in controller dynamics. The dual physical-simulated evaluation and use of generative modeling for state-tactile forecasting are positive elements that target key sensitivities to geometry, friction, and slip.

major comments (2)
  1. [Evaluation] The abstract states that CGP outperforms baselines across tasks but supplies no quantitative metrics, error bars, ablation results, or training-data distribution details; this renders the central performance claim unverifiable from the provided text and weakens assessment of statistical reliability.
  2. [Method] The contact-consistency mapping is presented as converting diffusion-predicted state-tactile pairs into executable compliance-controller targets, yet no derivation, stability bound, or analysis is given showing preservation of contact geometry and friction constraints under controller dynamics (particularly for rapid frictional transitions or evolving multi-point contacts).
minor comments (2)
  1. [Method] Provide explicit architecture details for the diffusion model (noise schedule, latent dimensions) and contact-consistency network (training loss, input/output mappings) to support reproducibility.
  2. [Evaluation] Clarify the exact composition of the visuomotor and visuotactile diffusion-policy baselines, including whether they share the same diffusion backbone or controller.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments on the evaluation and methodological aspects of our work. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Evaluation] The abstract states that CGP outperforms baselines across tasks but supplies no quantitative metrics, error bars, ablation results, or training-data distribution details; this renders the central performance claim unverifiable from the provided text and weakens assessment of statistical reliability.

    Authors: We acknowledge that the abstract does not include specific quantitative metrics. The full paper provides detailed results with error bars from repeated trials, ablation studies, and information on the training data distribution in Sections 4 and 5. To strengthen the abstract's verifiability, we will add key performance metrics, such as average success rates with standard deviations and notes on ablations, to the revised abstract. revision: yes

  2. Referee: [Method] The contact-consistency mapping is presented as converting diffusion-predicted state-tactile pairs into executable compliance-controller targets, yet no derivation, stability bound, or analysis is given showing preservation of contact geometry and friction constraints under controller dynamics (particularly for rapid frictional transitions or evolving multi-point contacts).

    Authors: The contact-consistency mapping is a neural network trained end-to-end to ensure that the diffusion model's predictions correspond to achievable states under the compliance controller, thereby preserving the intended contact geometry and friction properties as demonstrated in our physical and simulated experiments. While we do not provide a formal mathematical derivation or stability bounds in the current version, we will include additional analysis in the revised manuscript discussing how the mapping maintains contact constraints, supported by empirical observations on frictional transitions and multi-point contacts. revision: partial
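The rebuttal describes the mapping as a network trained so that the diffusion model's predictions correspond to targets the controller can actually realize. As a rough illustration of that kind of fit, here is a linear least-squares stand-in; the dimensions, data, and loss are assumptions, not the paper's architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
N, IN_DIM, OUT_DIM = 256, 40, 22   # (state+tactile latent) -> controller target

# Synthetic supervision: predicted state-tactile pairs X paired with the
# controller targets Y that were actually realized during demonstrations.
X = rng.standard_normal((N, IN_DIM))
W_true = rng.standard_normal((IN_DIM, OUT_DIM))   # unknown "achievable" map
Y = X @ W_true + 0.01 * rng.standard_normal((N, OUT_DIM))

# Least-squares fit stands in for training the mapping with an MSE loss.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
mse = float(np.mean((X @ W_hat - Y) ** 2))
assert mse < 1e-3   # with low noise, the mapping is recovered closely
```

What such a fit cannot supply by itself is exactly what the referee asks for: a guarantee that the mapped targets respect contact geometry and friction constraints outside the training distribution.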

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper presents an empirical pipeline: a conditional diffusion model is trained on observed state-tactile trajectories to forecast future pairs in latent space, after which a separate learned contact-consistency mapping converts those predictions into compliance-controller targets. Neither component is defined in terms of the other, nor is any fitted parameter relabeled as a prediction; both are trained on external data and evaluated against independent baselines on physical and simulated hardware. No self-citation chain, uniqueness theorem, or ansatz is invoked to force the central performance claims, so the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard assumptions of diffusion-model training (Gaussian noise schedule, Markovian reverse process) and the existence of a sufficiently expressive compliance controller. No new physical axioms or invented entities beyond the learned contact-consistency mapping are introduced.

free parameters (2)
  • diffusion noise schedule parameters
    Standard hyperparameters of the conditional diffusion model that are tuned during training.
  • contact-consistency network weights
    Learned parameters that map predicted state-tactile pairs to controller targets.
axioms (1)
  • domain assumption: The compliance controller can realize any target pose within its workspace without instability when the target is within the learned mapping's output distribution.
    Invoked when the contact-consistency mapping produces executable targets.
invented entities (1)
  • contact-consistency mapping · no independent evidence
    purpose: Converts predicted robot-state and tactile pairs into executable controller targets.
    New learned module introduced by the paper; no independent evidence outside the training data is provided.
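The diffusion-side entries in the ledger refer to the standard DDPM setup (Ho et al., 2020): a Gaussian forward process with a fixed noise schedule and a Markovian reverse process. A minimal sketch of those forward-process quantities, using generic default schedule values rather than the paper's reported settings:

```python
import numpy as np

# Linear beta schedule and cumulative signal coefficient alpha_bar_t,
# as in standard DDPM. These hyperparameters are the ledger's
# "diffusion noise schedule parameters".
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal(32)      # e.g. a compressed state-tactile latent
xT = q_sample(x0, T - 1, rng)     # by step T the signal is almost destroyed
```

The reverse (denoising) process, conditioned on observations, is what the paper's forecaster learns; the schedule above is the tunable but standard part.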

pith-pipeline@v0.9.0 · 5580 in / 1497 out tokens · 36351 ms · 2026-05-15T15:35:40.004090+00:00 · methodology


Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 3 internal anchors

  1. Shan An, Ziyu Meng, Chao Tang, Yuning Zhou, Tengyu Liu, Fangqiang Ding, Shufang Zhang, Yao Mu, Ran Song, Wei Zhang, et al. Dexterous manipulation through imitation learning: A survey. arXiv preprint arXiv:2504.03515, 2025.

  2. Yue Chang, Peter Yichen Chen, Zhecheng Wang, Maurizio M Chiaramonte, Kevin Carlberg, and Eitan Grinspun. Licrom: Linear-subspace continuous reduced order modeling with neural fields. In SIGGRAPH Asia 2023 Conference Papers, pages 1–12, 2023.

  3. Claire Chen, Zhongchun Yu, Hojung Choi, Mark Cutkosky, and Jeannette Bohg. Dexforce: Extracting force-informed actions from kinesthetic demonstrations for dexterous manipulation. IEEE Robotics and Automation Letters, 10(6):6416–6423, 2025. doi: 10.1109/LRA.2025.3568318.

  4. Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, and Shuran Song. Diffusion policy: Visuomotor policy learning via action diffusion. In Proceedings of Robotics: Science and Systems, 2023.

  5. Hojung Choi, Yifan Hou, Chuer Pan, Seongheon Hong, Austin Patel, Xiaomeng Xu, Mark R Cutkosky, and Shuran Song. In-the-wild compliant manipulation with UMI-FT. arXiv preprint arXiv:2601.09988, 2026.

  6. Hao-Shu Fang, Hengxu Yan, Zhenyu Tang, Hongjie Fang, Chenxi Wang, and Cewu Lu. Anydexgrasp: General dexterous grasping for different hands with human-level learning efficiency. arXiv preprint arXiv:2502.16420, 2025.

  7. Shangchen Han, Beibei Liu, Robert Wang, Yuting Ye, Christopher D Twigg, and Kenrick Kin. Online optical marker-based hand tracking with deep labels. ACM Transactions on Graphics (TOG), 37(4):1–10, 2018.

  8. Shangchen Han, Po-chen Wu, Yubo Zhang, Beibei Liu, Linguang Zhang, Zheng Wang, Weiguang Si, Peizhao Zhang, Yujun Cai, Tomas Hodan, et al. Umetrack: Unified multi-view end-to-end hand tracking for vr. In SIGGRAPH Asia 2022 Conference Papers, pages 1–9, 2022.

  9. Liang Heng, Haoran Geng, Kaifeng Zhang, Pieter Abbeel, and Jitendra Malik. Vitacformer: Learning cross-modal representation for visuo-tactile dexterous manipulation. arXiv preprint arXiv:2506.15953, 2025.

  10. Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 2020.

  11. Yifan Hou, Zeyi Liu, Cheng Chi, Eric Cousineau, Naveen Kuppuswamy, Siyuan Feng, Benjamin Burchfiel, and Shuran Song. Adaptive compliance policy: Learning approximate compliance for diffusion guided control. In IEEE International Conference on Robotics and Automation (ICRA), pages 4829–4836, 2025.

  12. Binghao Huang, Yixuan Wang, Xinyi Yang, Yiyue Luo, and Yunzhu Li. 3d-vitac: Learning fine-grained manipulation with visuo-tactile sensing. In 8th Annual Conference on Robot Learning, 2024.

  13. Zixuan Huang, Huaidian Hou, and Dmitry Berenson. Unified multimodal diffusion forcing for forceful manipulation. arXiv preprint arXiv:2511.04812, 2025.

  14. Gagan Khandate, Siqi Shang, Eric T Chang, Tristan Luca Saidi, Yang Liu, Seth Matthew Dennis, Johnson Adams, and Matei Ciocarlie. Sampling-based exploration for reinforcement learning of dexterous manipulation. In Proceedings of Robotics: Science and Systems, 2023.

  15. Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.

  16. Mike Lambeta, Tingfan Wu, Ali Sengul, Victoria Rose Most, Nolan Black, Kevin Sawyer, Romeo Mercado, Haozhi Qi, Alexander Sohn, Byron Taylor, et al. Digitizing touch with an artificial multimodal fingertip. arXiv preprint arXiv:2411.02479, 2024.

  17. Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, and Jitendra Malik. Twisting lids off with two hands. In Conference on Robot Learning, 2024.

  18. Jason Jingzhou Liu, Yulong Li, Kenneth Shaw, Tony Tao, Ruslan Salakhutdinov, and Deepak Pathak. Factr: Force-attending curriculum training for contact-rich policy learning. In Proceedings of Robotics: Science and Systems, 2025.

  19. Zhuoyang Liu, Jiaming Liu, Jiadong Xu, Nuowei Han, Chenyang Gu, Hao Chen, Kaichen Zhou, Renrui Zhang, Kai Chin Hsieh, Kun Wu, et al. Mla: A multisensory language-action model for multimodal understanding and forecasting in robotic manipulation. arXiv preprint arXiv:2509.26642, 2025.

  20. Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018.

  21. Haozhi Qi, Brent Yi, Mike Lambeta, Yi Ma, Roberto Calandra, and Jitendra Malik. From simple to complex skills: The case of in-hand object reorientation. In IEEE International Conference on Robotics and Automation (ICRA), 2025.

  22. Daniel Rakita, Bilge Mutlu, and Michael Gleicher. Relaxedik: Real-time synthesis of accurate and feasible robot arm motion. In Robotics: Science and Systems, volume 14, pages 26–30, 2018.

  23. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.

  24. Cristian Romero, Dan Casas, Maurizio Chiaramonte, and Miguel A Otaduy. Learning contact deformations with general collider descriptors. In SIGGRAPH Asia 2023 Conference Papers, pages 1–10, 2023.

  25. Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. Proceedings of International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=St1giarCHLP.

  26. Yutian Tao, Maurizio Chiaramonte, and Pablo Fernandez. Interpolated adaptive linear reduced order modeling for deformation dynamics. arXiv preprint arXiv:2509.25392, 2025.

  27. UT-Austin-RPL. deoxys_control. GitHub repository. URL https://github.com/UT-Austin-RPL/deoxys_control.

  29. Yeping Wang, Pragathi Praveena, Daniel Rakita, and Michael Gleicher. Rangedik: An optimization-based robot motion generation method for ranged-goal tasks. arXiv preprint arXiv:2302.13935, 2023.

  30. Mengda Xu, Han Zhang, Yifan Hou, Zhenjia Xu, Linxi Fan, Manuela Veloso, and Shuran Song. DexUMI: Using human hand as the universal manipulation interface for dexterous manipulation. In 9th Annual Conference on Robot Learning, 2025. URL https://openreview.net/forum?id=XrgRvBklWu.

  31. Xiaomeng Xu, Yifan Hou, Zeyi Liu, and Shuran Song. Compliant residual DAgger: Improving real-world contact-rich manipulation with human corrections. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=cjcm5LYVWm.

  32. Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, et al. Unidexgrasp: Universal robotic dexterous grasping via learning diverse proposal generation and goal-conditioned policy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4737–4746, 2023.

  33. Zhengtong Xu, Raghava Uppuluri, Xinwei Zhang, Cael Fitch, Philip Glen Crandall, Wan Shou, Dongyi Wang, and Yu She. Unit: Data efficient tactile representation with generalization to unseen objects. IEEE Robotics and Automation Letters, 2025.

  34. Han Xue, Jieji Ren, Wendi Chen, Gu Zhang, Yuan Fang, Guoying Gu, Huazhe Xu, and Cewu Lu. Reactive diffusion policy: Slow-fast visual-tactile policy learning for contact-rich manipulation. In Proceedings of Robotics: Science and Systems, 2025.

  35. Jianglong Ye, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, and Xiaolong Wang. Dex1b: Learning with 1b demonstrations for dexterous manipulation. In Proceedings of Robotics: Science and Systems, 2025.

  36. Zhao-Heng Yin, Binghao Huang, Yuzhe Qin, Qifeng Chen, and Xiaolong Wang. Rotating without seeing: Towards in-hand dexterity through touch. In Proceedings of Robotics: Science and Systems, 2023.

  37. Di Zhang, Chengbo Yuan, Chuan Wen, Hai Zhang, Junqiao Zhao, and Yang Gao. Kinedex: Learning tactile-informed visuomotor policies via kinesthetic teaching for dexterous manipulation. In 9th Annual Conference on Robot Learning, 2025. URL https://openreview.net/forum?id=GKueYvjqSS.

  38. Jialiang Zhang, Haoran Liu, Danshi Li, XinQiang Yu, Haoran Geng, Yufei Ding, Jiayi Chen, and He Wang. Dexgraspnet 2.0: Learning generative dexterous grasping in large-scale synthetic cluttered scenes. In 8th Annual Conference on Robot Learning, 2024.

  39. Jialiang Zhao, Naveen Kuppuswamy, Siyuan Feng, Benjamin Burchfiel, and Edward Adelson. Polytouch: A robust multi-modal tactile sensor for contact-rich manipulation using tactile-diffusion policies. In IEEE International Conference on Robotics and Automation (ICRA), pages 104–110, 2025. doi: 10.1109/ICRA55743.2025.11128816.

  40. Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. On the continuity of rotation representations in neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5745–5753, 2019.

  41. Yifeng Zhu and Abhishek Joshi. Viola: Imitation learning for vision-based manipulation with object proposal priors. In Proceedings of Conference on Robot Learning, 2022.

  42. Zeshun Zong, Xuan Li, Minchen Li, Maurizio M Chiaramonte, Wojciech Matusik, Eitan Grinspun, Kevin Carlberg, Chenfanfu Jiang, and Peter Yichen Chen. Neural stress fields for reduced-order elastoplasticity and fracture. In SIGGRAPH Asia 2023 Conference Papers, pages 1–11, 2023.