Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem

Bartłomiej Cupiał; Łukasz Kuciński; Maciej Wołczyk; Mateusz Ostaszewski; Michał Bortkiewicz; Michał Zając; Piotr Miłoś; Razvan Pascanu

arxiv: 2402.02868 · v3 · pith:UM7G25HQnew · submitted 2024-02-05 · 💻 cs.LG

Maciej Wołczyk, Bartłomiej Cupiał, Mateusz Ostaszewski, Michał Bortkiewicz, Michał Zając, Razvan Pascanu, Łukasz Kuciński, Piotr Miłoś

classification 💻 cs.LG

keywords fine-tuning, models, capabilities, pre-trained, problem, transfer, forgetting, learning

Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models. However, fine-tuning reinforcement learning (RL) models remains a challenge. This work conceptualizes one specific cause of poor transfer, accentuated in the RL setting by the interplay between actions and observations: forgetting of pre-trained capabilities. Namely, a model deteriorates on the state subspace of the downstream task not visited in the initial phase of fine-tuning, on which the model behaved well due to pre-training. This way, we lose the anticipated transfer benefits. We identify conditions when this problem occurs, showing that it is common and, in many cases, catastrophic. Through a detailed empirical analysis of the challenging NetHack and Montezuma's Revenge environments, we show that standard knowledge retention techniques mitigate the problem and thus allow us to take full advantage of the pre-trained capabilities. In particular, in NetHack, we achieve a new state-of-the-art for neural models, improving the previous best score from $5$K to over $10$K points in the Human Monk scenario.
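The abstract does not say which knowledge retention techniques were used, so the following is only a generic illustration of the idea, not the authors' method. It sketches one common retention approach, an L2 penalty that pulls fine-tuned parameters back toward their pre-trained values, on plain Python lists. The function names, the `coef` and `lr` parameters, and the toy gradient values are all assumptions made for this example.

```python
# Illustrative sketch only: L2 regularisation toward pre-trained parameters.
# All names here are hypothetical and do not come from the paper.

def l2_retention_penalty(params, pretrained_params, coef):
    """Quadratic penalty: 0.5 * coef * sum((theta - theta_pretrained)^2)."""
    return 0.5 * coef * sum(
        (p - p0) ** 2 for p, p0 in zip(params, pretrained_params)
    )


def l2_retention_grad(params, pretrained_params, coef):
    """Gradient of the penalty with respect to each parameter."""
    return [coef * (p - p0) for p, p0 in zip(params, pretrained_params)]


def fine_tune_step(params, task_grad, pretrained_params, lr=0.1, coef=1.0):
    """One gradient-descent step on (task loss + retention penalty)."""
    reg_grad = l2_retention_grad(params, pretrained_params, coef)
    return [p - lr * (g + r) for p, g, r in zip(params, task_grad, reg_grad)]


if __name__ == "__main__":
    pretrained = [0.0, 2.0]
    current = [1.0, 2.0]      # the first parameter has drifted during fine-tuning
    task_grad = [0.0, 0.0]    # toy value: the downstream task gives no signal here
    print(l2_retention_penalty(current, pretrained, coef=2.0))
    print(fine_tune_step(current, task_grad, pretrained, lr=0.1, coef=2.0))
```

With a zero task gradient, the only update comes from the penalty, so the drifted parameter moves back toward its pre-trained value while the unchanged one stays put. This is the sense in which a retention term can protect behaviour on parts of the state space that the early fine-tuning data does not cover.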

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Hierarchical Behaviour Spaces
cs.AI 2026-04 unverdicted novelty 6.0

Hierarchical Behaviour Spaces uses linear combinations of reward functions to induce expressive behavior spaces in hierarchical RL, yielding strong performance on NetHack primarily through better exploration rather th...

VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning
cs.RO 2025-05 conditional novelty 6.0

VLA-RL applies online RL to pretrained VLAs, yielding a 4.5% gain over strong baselines on 40 LIBERO manipulation tasks and matching commercial models like π₀-FAST.

Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies
cs.LG 2026-05 unverdicted novelty 5.0

Unsupervised behavioral mode discovery combined with mutual information rewards enables RL fine-tuning of multimodal generative policies that achieves higher success rates without losing action diversity.

Augmenting Game AI with Deep Reinforcement Learning
cs.AI 2026-06 unverdicted novelty 4.0

Proposes a requirements-based framework for RL-augmented game AI, discusses deployment practicalities, and identifies research bottlenecks for industry adoption.