Learning Gentle Object Manipulation with Curiosity-Driven Deep Reinforcement Learning

2019 · cs.RO · arXiv 1903.08542

2 Pith papers cite this work. Polarity classification is still indexing.



abstract

Robots must know how to be gentle when they need to interact with fragile objects, or when the robot itself is prone to wear and tear. We propose an approach that enables deep reinforcement learning to train policies that are gentle, both during exploration and task execution. In a reward-based learning environment, a natural approach involves augmenting the (task) reward with a penalty for non-gentleness, which can be defined as excessive impact force. However, augmenting with only this penalty impairs learning: policies get stuck in a local optimum which avoids all contact with the environment. Prior research has shown that combining auxiliary tasks or intrinsic rewards can be beneficial for stabilizing and accelerating learning in sparse-reward domains, and indeed we find that introducing a surprise-based intrinsic reward does avoid the no-contact failure case. However, we show that a simple dynamics-based surprise is not as effective as penalty-based surprise. Penalty-based surprise, based on predicting forceful contacts, has a further benefit: it encourages exploration which is contact-rich yet gentle. We demonstrate the effectiveness of the approach using a complex, tendon-powered robot hand with tactile sensors. Videos are available at http://sites.google.com/view/gentlemanipulation.
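The reward structure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the force threshold, the squared-error form of the surprise, and the weights are all assumptions introduced here, since the abstract does not give the exact formulation. It shows a task reward reduced by a penalty for excessive impact force and increased by an intrinsic bonus that grows when a (hypothetical) learned predictor misjudges that penalty.

```python
def impact_penalty(force: float, threshold: float = 1.0) -> float:
    """Assumed penalty: impact force in excess of a gentleness threshold."""
    return max(0.0, force - threshold)


def penalty_surprise(predicted_penalty: float, actual_penalty: float) -> float:
    """Assumed surprise: squared error of a penalty predictor's output."""
    return (predicted_penalty - actual_penalty) ** 2


def shaped_reward(
    task_reward: float,
    force: float,
    predicted_penalty: float,
    threshold: float = 1.0,        # hypothetical force limit
    penalty_weight: float = 1.0,   # hypothetical weight on the penalty
    surprise_weight: float = 0.1,  # hypothetical weight on the intrinsic bonus
) -> float:
    """Task reward, minus the gentleness penalty, plus the surprise bonus."""
    penalty = impact_penalty(force, threshold)
    surprise = penalty_surprise(predicted_penalty, penalty)
    return task_reward - penalty_weight * penalty + surprise_weight * surprise
```

With only the penalty term, avoiding contact entirely is a safe way to keep the penalty at zero, which matches the no-contact local optimum the abstract describes. In this sketch the surprise term rewards transitions where the predicted penalty is wrong, which can only be discovered by making contact.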

representative citing papers

Learning Hybrid-Control Policies for High-Precision In-Contact Manipulation Under Uncertainty

cs.RO · 2026-04-21 · unverdicted · novelty 7.0

MATCH trains hybrid position-force RL policies that achieve up to 10% higher success rates and 5x fewer breaks than pose-only policies in fragile peg-in-hole tasks under localization uncertainty, with strong sim-to-real results.

Deterministic Pareto-Optimal Policy Synthesis for Multi-Objective Reinforcement Learning

cs.LG · 2026-06-24 · unverdicted · novelty 5.0

Introduces a Chebyshev-motivated Bellman operator that provably envelopes and converges to a coverage set of the Pareto frontier in MOMDPs while allowing extraction of deterministic policies for any preference.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Learning Hybrid-Control Policies for High-Precision In-Contact Manipulation Under Uncertainty · cs.RO · 2026-04-21 · unverdicted · none · ref 22
Deterministic Pareto-Optimal Policy Synthesis for Multi-Objective Reinforcement Learning · cs.LG · 2026-06-24 · unverdicted · none · ref 1 · internal anchor
