pith. the verified trust layer for science.

arxiv: 1803.04996 · v2 · pith:G7AW7MWN · submitted 2018-03-13 · 💻 cs.RO

Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning

classification 💻 cs.RO
keywords policies · curriculum · learning · objects · reward · task · approaches · closed-loop

abstract

Enabling autonomous robots to interact with dynamic objects in unstructured environments requires manipulation capabilities that can deal with clutter, changes, and object variability. This paper compares different reinforcement learning-based approaches to object picking with a robotic manipulator. We learn closed-loop policies that map depth camera inputs to motion commands and compare different approaches to keeping the problem tractable, including reward shaping, curriculum learning, and warm-starting the full problem with a policy pre-trained on a task with a reduced action set. For efficient and more flexible data collection, we train in simulation and transfer the policies to a real robot. We show that, with curriculum learning, policies learned with a sparse reward formulation can be trained at rates similar to those achieved with a shaped reward. These policies reach success rates comparable to those of the policy initialized on the simplified task. We successfully transferred these policies to the real robot with only minor modifications to the depth image filtering. We found that using a heuristic to warm-start the training was useful for enforcing desired behavior, while the policies trained from scratch using a curriculum coped better with unseen scenarios in which objects are removed.
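
To make the contrast between a sparse reward, a shaped reward, and a curriculum concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the reward terms or the curriculum schedule, so the distance penalty, the `scale` value, the `Curriculum` class, its success-rate threshold, and the difficulty levels are all hypothetical choices made for illustration.

```python
def sparse_reward(picked: bool) -> float:
    """Reward is given only when the pick succeeds."""
    return 1.0 if picked else 0.0


def shaped_reward(picked: bool, dist_to_object: float, scale: float = 0.1) -> float:
    """Sparse success term plus a dense distance penalty.

    The distance penalty is an assumed shaping term, not taken from the paper.
    """
    return sparse_reward(picked) - scale * dist_to_object


class Curriculum:
    """Advance to a harder level once the recent success rate is high enough.

    Hypothetical schedule: `levels` could be, for example, the initial
    gripper-to-object distance in meters, ordered from easy to hard.
    """

    def __init__(self, levels, threshold=0.8, window=20):
        self.levels = list(levels)
        self.threshold = threshold
        self.window = window
        self.index = 0
        self.history = []

    @property
    def level(self):
        return self.levels[self.index]

    def record(self, success: bool) -> None:
        """Store one episode outcome and advance the level if warranted."""
        self.history.append(success)
        recent = self.history[-self.window:]
        rate_ok = (
            len(recent) == self.window
            and sum(recent) / self.window >= self.threshold
        )
        if rate_ok and self.index < len(self.levels) - 1:
            self.index += 1
            self.history.clear()  # start measuring afresh on the new level


# Usage: five consecutive successes move the agent from the first to the second level.
curriculum = Curriculum(levels=[0.05, 0.10, 0.20], threshold=0.8, window=5)
for _ in range(5):
    curriculum.record(True)
```

The idea sketched here is that a curriculum starts the agent on easy instances where the sparse success signal is reachable by chance, so a dense shaping term is not needed to provide early learning signal.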



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. One Interface, Many Robots: Unified Real-Time Low-Level Motion Planning for Collaborative Arms

    cs.RO · 2026-04 · unverdicted · novelty 4.0

    A middleware interface employs n-degree polynomial interpolation and quadratic programming to produce smooth, real-time end-effector trajectories for collaborative arms, validated in offline drawing, dynamic grasping,...