
arXiv: 1902.09996 · v1 · pith:XH5PU5CN · submitted 2019-02-26 · cs.AI · cs.LG · stat.ML

The Termination Critic

classification: cs.AI · cs.LG · stat.ML
keywords: termination · condition · option · options · abstractions · critic · learning · model


In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination condition, as opposed to -- as is common -- the policy. The termination condition is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option's encoding -- arguably a key reason for using abstractions. To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a "critic" for the termination condition. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning and planning.
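To make the idea of an option transition model concrete, the following is a minimal tabular sketch, not the paper's algorithm. It assumes a known state-to-state transition matrix under the option's policy (`P_pi`) and per-state termination probabilities (`beta`), and it assumes that "compressibility" can be read as low entropy of the distribution over final states. All names here are hypothetical. The paper learns the model and derives gradients from it, whereas this sketch solves for the model in closed form on a toy chain.

```python
import numpy as np


def option_model(P_pi, beta):
    """Distribution over termination states, P(x_f | x_s), for one option.

    Assumes the standard options convention: the option takes one step,
    then terminates in the new state x' with probability beta[x'],
    otherwise it continues from x'. This gives the linear system
        P_opt = P_pi diag(beta) + P_pi diag(1 - beta) P_opt.
    """
    n = P_pi.shape[0]
    cont = P_pi * (1.0 - beta)[None, :]  # step to x' and continue
    stop = P_pi * beta[None, :]          # step to x' and terminate
    return np.linalg.solve(np.eye(n) - cont, stop)


def termination_entropy(P_opt, eps=1e-12):
    """Entropy (in nats) of the final-state distribution per start state."""
    p = np.clip(P_opt, eps, 1.0)
    return -(P_opt * np.log(p)).sum(axis=1)


# Toy example (an assumption for illustration): a symmetric random walk
# on a ring of four states.
P_pi = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])

beta_diffuse = np.full(4, 0.5)                      # may stop anywhere
beta_focused = np.array([0.01, 0.01, 0.01, 0.99])   # stops mostly in state 3

H_diffuse = termination_entropy(option_model(P_pi, beta_diffuse))
H_focused = termination_entropy(option_model(P_pi, beta_focused))

print("mean entropy, diffuse termination:", H_diffuse.mean())
print("mean entropy, focused termination:", H_focused.mean())
```

In this toy setting, the termination condition that concentrates on a single state yields a lower-entropy final-state distribution. That is the kind of signal a model acting as a "critic" could provide to the termination condition.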


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Joint Learning of Hierarchical Neural Options and Abstract World Model

    cs.LG · 2026-02 · unverdicted · novelty 5.0

    AgentOWL jointly learns an abstract world model and hierarchical neural options to acquire more skills with less data and better generalization than baselines on Object-Centric Atari games.