Program-Based Strategy Induction for Reinforcement Learning

Carlos G. Correa; Nathaniel D. Daw; Thomas L. Griffiths

arXiv: 2402.16668 · v1 · pith:7YQSP2KYnew · submitted 2024-02-26 · cs.LG · cs.AI

keywords: strategies, learning, like, models, difficult, discrete, incremental, induction

Typical models of learning assume incremental estimation of continuously-varying decision variables like expected rewards. However, this class of models fails to capture more idiosyncratic, discrete heuristics and strategies that people and animals appear to exhibit. Despite recent advances in strategy discovery using tools like recurrent networks that generalize the classic models, the resulting strategies are often onerous to interpret, making connections to cognition difficult to establish. We use Bayesian program induction to discover strategies implemented by programs, letting the simplicity of strategies trade off against their effectiveness. Focusing on bandit tasks, we find strategies that are difficult or unexpected with classical incremental learning, like asymmetric learning from rewarded and unrewarded trials, adaptive horizon-dependent random exploration, and discrete state switching.
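
To make the simplicity-versus-effectiveness trade-off concrete, here is a minimal sketch in Python. It is not the authors' induction procedure: it only scores two hand-written candidate strategies on a Bernoulli bandit by mean reward minus a penalty on program size. The function names, the hand-assigned `size` values, and the `penalty` weight are all assumptions made for illustration.

```python
import random


def win_stay_lose_shift(prev_action, prev_reward, n_arms, rng):
    """Discrete heuristic: repeat a rewarded arm, otherwise move to the next arm."""
    if prev_action is None:
        return rng.randrange(n_arms)  # no history yet, so pick at random
    if prev_reward == 1:
        return prev_action            # win: stay
    return (prev_action + 1) % n_arms  # lose: shift


def random_choice(prev_action, prev_reward, n_arms, rng):
    """Baseline strategy that ignores feedback entirely."""
    return rng.randrange(n_arms)


def run_bandit(strategy, probs, horizon, rng):
    """Play one episode and return the total reward collected."""
    total = 0
    action, reward = None, None
    for _ in range(horizon):
        action = strategy(action, reward, len(probs), rng)
        reward = 1 if rng.random() < probs[action] else 0
        total += reward
    return total


def score(strategy, size, probs, horizon, n_episodes, penalty, seed=0):
    """Mean episode reward minus a complexity cost (illustrative stand-in for a prior).

    `size` is a hypothetical, hand-assigned measure of program length.
    """
    rng = random.Random(seed)
    rewards = [run_bandit(strategy, probs, horizon, rng) for _ in range(n_episodes)]
    return sum(rewards) / n_episodes - penalty * size


if __name__ == "__main__":
    probs = [0.8, 0.2]
    # Sizes below are assumptions: the heuristic is "longer" than the baseline.
    candidates = {"wsls": (win_stay_lose_shift, 3), "random": (random_choice, 1)}
    for name, (strategy, size) in candidates.items():
        print(name, score(strategy, size, probs, horizon=20, n_episodes=500, penalty=0.5))
```

Raising `penalty` favors the shorter program even when it earns less reward; lowering it favors the more effective one. The sketch illustrates only this tension. It does not search over programs or perform Bayesian inference.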

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Path-dependent program induction under resource constraints explains human sequence learning
cs.AI 2026-05 unverdicted novelty 6.0

A hierarchical program induction model under memory and computation constraints accounts for human melodic sequence learning errors, reaction times, and continuation choices better than fixed grammars or shallow chunking.