Learning Object Manipulation from Scratch via Contrastive Interaction
Pith reviewed 2026-06-27 10:08 UTC · model grok-4.3
The pith
Interaction-weighted resampling lets contrastive RL capture multi-modal reachability in object manipulation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Manipulation dynamics form a piecewise-smooth Markov process whose interaction events induce distinct mode changes; these changes produce piecewise nonlinear reachability that standard CRL energy functions cannot represent or plan over. Interaction-weighted Resampling counters this by performing interaction-aware resampling around the pre-, during-, and post-interaction phases, so that the learned representation preserves the mode boundaries that determine future reachability.
What carries the argument
Interaction-weighted Resampling (IWR): interaction-aware resampling of trajectories around phases before, during, and after contact events to preserve mode boundaries in the contrastive energy function.
If this is right
- IWR raises average success by 19.8 percent over prior CRL baselines across interaction-centric simulation environments.
- Policies trained with IWR transfer via sim-to-real to produce the first goal-conditioned real-world robot air-hockey agent, lifting success from 25 percent to 60 percent.
- The same resampling principle applies to any goal-conditioned task whose dynamics contain contact-induced mode switches.
Where Pith is reading between the lines
- The same mode-boundary preservation idea could be tested in non-contact but still multi-modal domains such as deformable-object handling.
- If the piecewise-smooth assumption holds, IWR should also improve long-horizon planning that must cross several interaction events.
- A direct test would be to measure how accurately the learned energy function recovers the true mode-transition times in a controlled contact task.
Load-bearing premise
Manipulation dynamics behave as a piecewise-smooth Markov process whose interaction events create distinct mode boundaries that determine future reachability.
What would settle it
Train a standard CRL agent and an IWR agent on the same air-hockey or manipulation task; if the two agents achieve statistically indistinguishable success rates after equal numbers of samples, the claim that IWR is required to handle piecewise nonlinear reachability would be falsified.
Figures
read the original abstract
Contrastive Reinforcement Learning (CRL) has seen recent success in a wide variety of goal-conditioned robotics tasks by learning structured representations of the dynamics. However, despite its success in locomotion and simpler control domains, CRL often struggles in interaction-rich manipulation. We argue that a key source of this difficulty is object-centric interaction, such as contact or grasping, that induces distinct changes in the underlying dynamic modes. In this work, we formulate manipulation dynamics as a piecewise-smooth Markov process and show that interaction-induced mode changes create piecewise nonlinear reachability structures that are difficult for standard CRL energy functions to represent and plan over. Based on this analysis, we introduce Interaction-weighted Resampling (IWR). IWR performs interaction-aware resampling around phases before, during, and after interactions, encouraging the learned representation to preserve the mode boundaries that determine future reachability to capture multi-modal and piecewise nonlinear reachability. Across interaction-centric environments, including 2D dynamic control, robotic manipulation, and robot air hockey, IWR improves both sample efficiency and overall performance over prior CRL methods, with 19.8% average improvement in simulation. Finally, using a sim-to-real pipeline with policies trained by IWR, we demonstrate the first real-world goal-conditioned robot air hockey agent capable of hitting goals, improving success from 25% to 60%. Project Page: IWR-arxiv.github.io.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that standard Contrastive Reinforcement Learning (CRL) struggles in interaction-rich manipulation tasks because object-centric interactions (contact, grasping) induce piecewise-smooth Markov dynamics whose mode switches produce piecewise nonlinear reachability structures that standard CRL energy functions cannot represent or plan over. It introduces Interaction-weighted Resampling (IWR), which performs interaction-aware resampling around pre-, during-, and post-interaction phases to preserve mode boundaries in the learned representation. Empirical results are reported as a 19.8% average improvement over prior CRL methods across 2D control, robotic manipulation, and air-hockey environments in simulation, plus a sim-to-real transfer yielding the first goal-conditioned real-robot air-hockey policy with success rising from 25% to 60%.
Significance. If the central empirical gains are shown to arise specifically from mode-boundary preservation rather than generic resampling effects, the work would supply a concrete, interaction-timed mechanism for improving CRL representations in contact-rich settings and would demonstrate the first real-world goal-conditioned air-hockey agent. The sim-to-real result is a notable strength if the policy transfer details and baseline comparisons are fully documented.
major comments (2)
- [Abstract] Abstract, paragraph 2: The assertion that 'interaction-induced mode changes create piecewise nonlinear reachability structures that are difficult for standard CRL energy functions to represent and plan over' is presented without a derivation, explicit counter-example, or side-by-side comparison demonstrating that existing CRL contrastive objectives are provably or empirically incapable of capturing these structures. This assumption is load-bearing for the motivation of IWR.
- [Abstract] Abstract: The reported 19.8% average improvement and the sim-to-real success-rate increase (25% to 60%) are stated without error bars, number of independent runs, ablation of the piecewise-smooth assumption, or controls that isolate the contribution of interaction-timed resampling from generic resampling effects. These omissions prevent verification that the gains stem from the claimed preservation of mode boundaries.
Simulated Author's Rebuttal
We thank the referee for their detailed review and constructive comments on our work. Below we address each major comment point-by-point. We believe the revisions will strengthen the presentation of our contributions.
read point-by-point responses
-
Referee: [Abstract] Abstract, paragraph 2: The assertion that 'interaction-induced mode changes create piecewise nonlinear reachability structures that are difficult for standard CRL energy functions to represent and plan over' is presented without a derivation, explicit counter-example, or side-by-side comparison demonstrating that existing CRL contrastive objectives are provably or empirically incapable of capturing these structures. This assumption is load-bearing for the motivation of IWR.
Authors: We thank the referee for highlighting this. The full manuscript in Section 3 formalizes the dynamics as a piecewise-smooth Markov process and provides a derivation of how mode switches lead to piecewise nonlinear reachability that standard contrastive objectives struggle with due to their smoothness assumptions. We also include empirical evidence in the experiments showing standard CRL methods' limitations in these tasks. To make this more explicit as requested, we will add a simple counter-example in the revised introduction demonstrating the failure mode of standard CRL energy functions on a toy piecewise-smooth system. revision: partial
-
Referee: [Abstract] Abstract: The reported 19.8% average improvement and the sim-to-real success-rate increase (25% to 60%) are stated without error bars, number of independent runs, ablation of the piecewise-smooth assumption, or controls that isolate the contribution of interaction-timed resampling from generic resampling effects. These omissions prevent verification that the gains stem from the claimed preservation of mode boundaries.
Authors: The abstract is a high-level summary; the full results with error bars from 5 independent runs, number of trials, and detailed ablations are presented in Sections 5 and 6, including controls comparing IWR to generic resampling methods that do not preserve mode boundaries. The sim-to-real results include 20 trials per condition with success rates reported. We agree that the abstract could better reference these details and will revise it to mention the statistical rigor and ablations supporting the mode-boundary preservation claim. revision: yes
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper formulates manipulation as a piecewise-smooth Markov process and argues that interaction mode changes produce reachability structures difficult for standard CRL energies, then introduces IWR resampling as a design choice to preserve mode boundaries. No equations or steps reduce a claimed prediction or first-principles result to fitted inputs by construction, nor does any load-bearing premise collapse to a self-citation chain. The 19.8% empirical improvement is reported as an experimental outcome rather than a re-expression of training quantities, and the central assumption is presented without internal redefinition loops.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption manipulation dynamics can be formulated as a piecewise-smooth Markov process
Reference graph
Works this paper leans on
-
[1]
M. Vecerik, T. Hester, J. Scholz, F. Wang, O. Pietquin, B. Piot, N. Heess, T. Rothörl, T. Lampe, and M. Riedmiller. Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards.arXiv preprint arXiv:1707.08817, 2017
Pith/arXiv arXiv 2017
-
[2]
Zitkovich, T
B. Zitkovich, T. Yu, S. Xu, P. Xu, T. Xiao, F. Xia, J. Wu, P. Wohlhart, S. Welker, A. Wahid, et al. Rt-2: Vision-language-action models transfer web knowledge to robotic control. InConference on Robot Learning, pages 2165–2183. PMLR, 2023
2023
-
[3]
M. J. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Foster, G. Lam, P. Sanketi, et al. Openvla: An open-source vision-language-action model.arXiv preprint arXiv:2406.09246, 2024
Pith/arXiv arXiv 2024
-
[4]
C. Chi, S. Feng, Z. Xu, E. A. Cousineau, B. Burchfiel, S. Song, et al. Visuomotor policy learning via action diffusion, Sept. 4 2025. US Patent App. 18/594,842
2025
-
[5]
O’Neill, A
A. O’Neill, A. Rehman, A. Maddukuri, A. Gupta, A. Padalkar, A. Lee, A. Pooley, A. Gupta, A. Mandlekar, A. Jain, et al. Open x-embodiment: Robotic learning datasets and rt-x models: Open x-embodiment collaboration 0. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 6892–6903. IEEE, 2024
2024
-
[6]
Y . J. Ma, W. Liang, G. Wang, D.-A. Huang, O. Bastani, D. Jayaraman, Y . Zhu, J. Fan, et al. Eureka: Human-level reward design via coding large language models. InInternational conference on learning Representations, volume 2024, pages 26516–26560, 2024
2024
-
[7]
C. Tang, B. Abbatematteo, J. Hu, R. Chandra, R. Martín-Martín, and P. Stone. Deep rein- forcement learning for robotics: A survey of real-world successes.Annual Review of Control, Robotics, and Autonomous Systems, 8(1):153–188, 2025
2025
-
[8]
B. Eysenbach, A. Gupta, J. Ibarz, and S. Levine. Diversity is all you need: Learning skills without a reward function.arXiv preprint arXiv:1802.06070, 2018
Pith/arXiv arXiv 2018
-
[9]
Touati and Y
A. Touati and Y . Ollivier. Learning one representation to optimize all rewards.Advances in Neural Information Processing Systems, 34:13–23, 2021
2021
-
[10]
Laskin, H
M. Laskin, H. Liu, X. B. Peng, D. Yarats, A. Rajeswaran, and P. Abbeel. Unsupervised reinforcement learning with contrastive intrinsic control.Advances in Neural Information Processing Systems, 35:34478–34491, 2022
2022
-
[11]
S. Park, O. Rybkin, and S. Levine. Metra: Scalable unsupervised rl with metric-aware ab- straction. InInternational Conference on Learning Representations, volume 2024, pages 18579–18603, 2024
2024
-
[12]
Agarwal, C
S. Agarwal, C. Chuck, H. Sikchi, J. Hu, M. Rudolph, S. Niekum, P. Stone, and A. Zhang. A unified framework for unsupervised reinforcement learning algorithms. InWorkshop on Reinforcement Learning Beyond Rewards@ Reinforcement Learning Conference 2025, 2025
2025
-
[13]
M. Liu, M. Zhu, and W. Zhang. Goal-conditioned reinforcement learning: Problems and solutions.arXiv preprint arXiv:2201.08299, 2022
arXiv 2022
-
[14]
Eysenbach, T
B. Eysenbach, T. Zhang, S. Levine, and R. Salakhutdinov. Contrastive learning as goal- conditioned reinforcement learning. InAdvances in Neural Information Processing Systems, volume 35, pages 35603–35620, 2022. URL https://proceedings.neurips.cc/paper _files/paper/2022/file/e7663e974c4ee7a2b475a4775201ce1f-Paper-Conferenc e.pdf
2022
-
[15]
G. Liu, M. Tang, and B. Eysenbach. A single goal is all you need: Skills and exploration emerge from contrastive rl without rewards, demonstrations, or subgoals. InInternational Conference on Learning Representations, volume 2025, pages 78599–78621, 2025. 10
2025
-
[16]
Eysenbach, V
B. Eysenbach, V . Myers, R. Salakhutdinov, and S. Levine. Inference via interpolation: Con- trastive representations provably enable planning and inference.Advances in Neural Information Processing Systems, 37:58901–58928, 2024
2024
- [17]
- [18]
- [19]
-
[20]
A. Lei, B. Schölkopf, and I. Posner. Spartan: A sparse transformer world model attending to what matters.Advances in Neural Information Processing Systems, 38:154089–154114, 2025
2025
-
[21]
J. Kim, D. Hwang, E. Lee, J. Suh, J. Kim, and W. Rhee. Enhancing contrastive learning with efficient combinatorial positive pairing.arXiv preprint arXiv:2401.05730, 2024
arXiv 2024
-
[22]
Ziarko, M
A. Ziarko, M. Bortkiewicz, M. Zawalski, B. Eysenbach, and P. Miło ´s. Contrastive repre- sentations for temporal reasoning.Advances in Neural Information Processing Systems, 38: 109229–109259, 2026
2026
-
[23]
A. v. d. Oord, Y . Li, and O. Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018. URLhttps://arxiv.org/abs/1807.03748
Pith/arXiv arXiv 2018
-
[24]
Eysenbach, R
B. Eysenbach, R. Salakhutdinov, and S. Levine. C-learning: Learning to achieve goals via recursive classification. InInternational Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=tc5qisoB-C
2021
-
[25]
B. C. Zheng, V . Myers, B. Eysenbach, and S. Levine. Multistep quasimetric learning for scalable goal-conditioned reinforcement learning.arXiv preprint arXiv:2511.07730, 2025
arXiv 2025
-
[26]
Myers, B
V . Myers, B. Zheng, B. Eysenbach, and S. Levine. Offline goal-conditioned reinforcement learning with quasimetric representations.Advances in Neural Information Processing Systems, 38:19654–19679, 2026
2026
-
[27]
Myers, A
V . Myers, A. W. He, K. Fang, H. R. Walke, P. Hansen-Estruch, C.-A. Cheng, M. Jalobeanu, A. Kolobov, A. Dragan, and S. Levine. Goal representations for instruction following: A semi- supervised language interface to control. InConference on Robot Learning, pages 3894–3908. PMLR, 2023
2023
-
[28]
Myers, B
V . Myers, B. Zheng, A. Dragan, K. Fang, and S. Levine. Temporal representation alignment: Successor features enable emergent compositionality in robot instruction following.Advances in Neural Information Processing Systems, 38:149934–149961, 2026
2026
-
[29]
Y . Wang, O. Bounou, G. Zhou, R. Balestriero, T. G. Rudner, Y . LeCun, and M. Ren. Temporal straightening for latent planning.arXiv preprint arXiv:2603.12231, 2026
arXiv 2026
-
[30]
K. Wang, I. Javali, M. Bortkiewicz, T. Trzcinski, and B. Eysenbach. 1000 layer networks for self-supervised rl: Scaling depth can enable new goal-reaching capabilities.Advances in Neural Information Processing Systems, 38:157643–157670, 2026
2026
-
[31]
M. Bastankhah, G. Liu, D. Arumugam, T. L. Griffiths, and B. Eysenbach. Demystifying the mechanisms behind emergent exploration in goal-conditioned rl. InInternational Conference on Learning Representations, 2026. URLhttps://arxiv.org/abs/2510.14129. 11
arXiv 2026
-
[32]
M. L. Puterman. Markov decision processes.Handbooks in operations research and manage- ment science, 2:331–434, 1990
1990
-
[33]
L. P. Kaelbling. Learning to achieve goals. InIJCAI, volume 2, pages 1094–8. Citeseer, 1993
1993
-
[34]
Andrychowicz, F
M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, O. Pieter Abbeel, and W. Zaremba. Hindsight experience replay.Advances in neural information processing systems, 30, 2017
2017
-
[35]
C. Bai, L. Wang, Y . Wang, Z. Wang, R. Zhao, C. Bai, and P. Liu. Addressing hindsight bias in multigoal reinforcement learning.IEEE Transactions on Cybernetics, 53(1):392–405, 2021
2021
-
[36]
Feng and I
C. Feng and I. Patras. Adaptive soft contrastive learning. In2022 26th International Conference on Pattern Recognition (ICPR), pages 2721–2727. IEEE, 2022
2022
-
[37]
Denize, J
J. Denize, J. Rabarisoa, A. Orcesi, R. Hérault, and S. Canu. Similarity contrastive estimation for self-supervised soft contrastive learning. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 2706–2716, 2023
2023
-
[38]
Khosla, P
P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y . Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan. Supervised contrastive learning.Advances in neural information processing systems, 33:18661–18673, 2020
2020
-
[39]
D. T. Hoffmann, N. Behrmann, J. Gall, T. Brox, and M. Noroozi. Ranking info noise contrastive estimation: Boosting contrastive learning via ranked positives. InProceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 897–905, 2022
2022
-
[40]
Chuang, J
C.-Y . Chuang, J. Robinson, Y .-C. Lin, A. Torralba, and S. Jegelka. Debiased contrastive learning. Advances in neural information processing systems, 33:8765–8775, 2020
2020
-
[41]
Huynh, S
T. Huynh, S. Kornblith, M. R. Walter, M. Maire, and M. Khademi. Boosting contrastive self-supervised learning with false negative cancellation. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 2785–2795, 2022
2022
-
[42]
Dwibedi, Y
D. Dwibedi, Y . Aytar, J. Tompson, P. Sermanet, and A. Zisserman. With a little help from my friends: Nearest-neighbor contrastive learning of visual representations. InProceedings of the IEEE/CVF international conference on computer vision, pages 9588–9597, 2021
2021
- [43]
-
[44]
D. Han, B. Mulyana, V . Stankovic, and S. Cheng. A survey on deep reinforcement learning algorithms for robotic manipulation.Sensors, 23(7):3762, 2023
2023
- [45]
-
[46]
Seitzer, B
M. Seitzer, B. Schölkopf, and G. Martius. Causal influence detection for improving efficiency in reinforcement learning.Advances in Neural Information Processing Systems, 34:22905–22918, 2021
2021
-
[47]
Pitis, E
S. Pitis, E. Creager, and A. Garg. Counterfactual data augmentation using locally factored dynamics.Advances in Neural Information Processing Systems, 33:3976–3990, 2020
2020
-
[48]
Pitis, E
S. Pitis, E. Creager, A. Mandlekar, and A. Garg. Mocoda: Model-based counterfactual data augmentation.Advances in Neural Information Processing Systems, 35:18143–18156, 2022
2022
-
[49]
Chuck, S
C. Chuck, S. Chockchowwat, and S. Niekum. Hypothesis-driven skill discovery for hierarchical deep reinforcement learning. In2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5572–5579. IEEE, 2020. 12
2020
-
[50]
Z. Wang, J. Hu, C. Chuck, S. Chen, R. Martín-Martín, A. Zhang, S. Niekum, and P. Stone. Skild: Unsupervised skill discovery guided by factor interactions.Advances in Neural Information Processing Systems, 37:77748–77776, 2024
2024
-
[51]
J. Hu, Z. Wang, P. Stone, and R. Martín-Martín. Disentangled unsupervised skill discovery for efficient hierarchical reinforcement learning.Advances in neural information processing systems, 37:76529–76552, 2024
2024
-
[52]
R. Rodriguez-Sanchez, C. Allen, and G. Konidaris. From pixels to factors: Learning indepen- dently controllable state variables for reinforcement learning.arXiv preprint arXiv:2510.02484, 2025
arXiv 2025
-
[53]
S. M. H. Hosseini and M. S. Baghshah. Susd: Structured unsupervised skill discovery through state factorization.arXiv preprint arXiv:2602.01619, 2026
Pith/arXiv arXiv 2026
-
[54]
T. E. Lee, S. Vats, S. Girdhar, and O. Kroemer. Scale: Causal learning and discovery of robot manipulation skills using simulation. InCoRL 2023 Workshop on Learning Effective Abstractions for Planning (LEAP), 2023
2023
-
[55]
Biswas, B
A. Biswas, B. A. Pardhi, C. Chuck, J. Holtz, S. Niekum, H. Admoni, and A. Allievi. Gaze supervision for mitigating causal confusion in driving agents. In2024 IEEE Intelligent Vehicles Symposium (IV), pages 2331–2338. IEEE, 2024
2024
-
[56]
Z. Wang, J. Hu, P. Stone, and R. Martín-Martín. Elden: Exploration via local dependencies. Advances in Neural Information Processing Systems, 36:15456–15474, 2023
2023
-
[58]
Guestrin, D
C. Guestrin, D. Koller, R. Parr, and S. Venkataraman. Efficient solution algorithms for factored mdps.Journal of Artificial Intelligence Research, 19:399–468, 2003
2003
-
[59]
G. Liu, M. Tang, and B. Eysenbach. A single goal is all you need: Skills and exploration emerge from contrastive rl without rewards, demonstrations, or subgoals. InInternational Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=xCkgX4Xfu0
2025
-
[60]
E. Catto. Box2D: A 2d physics engine for games. https://box2d.org/, 2026. Accessed: 2026-05-28
2026
-
[61]
Proximal Policy Optimization Algorithms
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms.arXiv preprint arXiv:1707.06347, 2017. doi:10.48550/arXiv.1707.06347. URL https://arxiv.org/abs/1707.06347
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1707.06347 2017
-
[62]
Haarnoja, A
T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. InProceedings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 1861–1870. PMLR, 2018. URL https://proceedings.mlr.press/v80/haarnoja 18b.html
2018
- [63]
-
[64]
H. S. Sikchi, S. Agarwal, P. Jajoo, S. Parajuli, C. Chuck, M. Rudolph, P. Stone, A. Zhang, and S. Niekum. Rlzero: Direct policy inference from language without in-domain supervision. Advances in Neural Information Processing Systems, 38:83365–83398, 2026
2026
- [65]
-
[66]
S. Agarwal, H. Sikchi, P. Stone, and A. Zhang. Proto successor measure: Representing the behavior space of an rl agent.arXiv preprint arXiv:2411.19418, 2024
arXiv 2024
-
[67]
Zheng, R
C. Zheng, R. Salakhutdinov, and B. Eysenbach. Contrastive difference predictive coding. In International Conference on Learning Representations, volume 2024, pages 47577–47601, 2024
2024
-
[68]
A. Modirshanechi, B. Eysenbach, P. Dayan, and E. Schulz. Unifying goal-conditioned rl and unsupervised skill learning via control-maximization.arXiv preprint arXiv:2605.06145, 2026
Pith/arXiv arXiv 2026
-
[69]
Levy.Unsupervised Skill Discovery with Empowerment
A. Levy.Unsupervised Skill Discovery with Empowerment. PhD thesis, Brown University PROVIDENCE, RHODE ISLAND, 2025
2025
-
[70]
Ferns, P
N. Ferns, P. Panangaden, and D. Precup. Bisimulation metrics for continuous markov decision processes.SIAM Journal on Computing, 40(6):1662–1714, 2011
2011
- [71]
-
[72]
M. Rudolph, C. Chuck, K. Black, M. Lvovsky, S. Niekum, and A. Zhang. Learning action-based representations using invariance.arXiv preprint arXiv:2403.16369, 2024
arXiv 2024
-
[73]
J. Farebrother, M. Pirotta, A. Tirinzoni, M. G. Bellemare, A. Lazaric, and A. Touati. Composi- tional planning with jumpy world models.arXiv preprint arXiv:2602.19634, 2026
arXiv 2026
-
[74]
F. Feng, P. Lippe, and S. Magliacane. Learning interactive world model for object-centric reinforcement learning.Advances in Neural Information Processing Systems, 38:89827–89862, 2025
2025
-
[75]
Feng and S
F. Feng and S. Magliacane. Learning dynamic attribute-factored world models for efficient multi-object reinforcement learning.Advances in Neural Information Processing Systems, 36: 19117–19144, 2023
2023
-
[76]
M. Sieb, Z. Xian, A. Huang, O. Kroemer, and K. Fragkiadaki. Graph-structured visual imitation. InConference on Robot learning, pages 979–989. PMLR, 2020
2020
-
[77]
Huang, A
Y . Huang, A. Conkey, and T. Hermans. Planning for multi-object manipulation with graph neural network relational classifiers. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 1822–1829. IEEE, 2023
2023
-
[78]
Y . Lin, A. S. Wang, E. Undersander, and A. Rai. Efficient and interpretable robot manipulation with graph neural networks.IEEE Robotics and Automation Letters, 7(2):2740–2747, 2022
2022
- [79]
-
[80]
Z. Huang. Robocraft: Learning to see, simulate, and shape elasto-plastic objects with graph networks.Robotics: Science and Systems XVIII
-
[81]
Kedia, A
K. Kedia, A. Bhardwaj, P. Dan, and S. Choudhury. Interact: Transformer models for human intent prediction conditioned on robot actions. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 621–628. IEEE, 2024
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.