Timed reward machines extend reward machines with timing constraints, allowing model-free RL algorithms to learn policies that satisfy precise temporal requirements on standard benchmarks.
and Valenzano, Richard and McIlraith, Sheila A
2 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
- About Time: Model-free Reinforcement Learning with Timed Reward Machines
- Reward Shaping and Action Masking for Compositional Tasks using Behavior Trees and LLMs