Gridworld agents trained with DReST reward functions learn to be USEFUL at tasks conditional on trajectory length and NEUTRAL across lengths, supplying initial evidence that the method could produce shutdownable agents.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2024 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Towards Shutdownable Agents via Stochastic Choice
Gridworld agents trained with DReST reward functions learn to be USEFUL at tasks conditional on trajectory length and NEUTRAL across lengths, supplying initial evidence that the method could produce shutdownable agents.