Disentangling Transfer in Continual Reinforcement Learning

Łukasz Kuciński; Maciej Wołczyk; Michał Zając; Piotr Miłoś; Razvan Pascanu

arxiv: 2209.13900 · v1 · pith:OTL5L3CTnew · submitted 2022-09-28 · 💻 cs.LG

keywords: continual, transfer, learning, tasks, world, benchmark, best, clonex-sac

Abstract

Transferring knowledge from previously seen tasks to maximize performance on new tasks remains a significant challenge for continual learning systems, and it limits the applicability of continual learning solutions to realistic scenarios. This study therefore aims to broaden our understanding of transfer and its driving forces in the specific case of continual reinforcement learning. We adopt Soft Actor-Critic (SAC) as the underlying RL algorithm and Continual World as a suite of continuous control tasks. We systematically study how different components of SAC (the actor and the critic, exploration, and data) affect transfer efficacy, and we provide recommendations regarding various modeling options. The best set of choices, dubbed ClonEx-SAC, is evaluated on the recent Continual World benchmark. ClonEx-SAC achieves an 87% final success rate, compared with 80% for PackNet, the best method in the benchmark. Moreover, transfer grows from 0.18 to 0.54 according to the metric provided by Continual World.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
cs.AI · 2023-06 · conditional novelty 8.0

LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.