Klassen, Richard Valenzano, and Sheila A
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Combines LTL formal methods with LLMs for auditing, predictive monitoring, and runtime intervention on temporally extended behavioral constraints, outperforming LLM baselines and reducing violations.
citing papers explorer
- About Time: Model-free Reinforcement Learning with Timed Reward Machines
  Timed reward machines extend reward machines with timing constraints, allowing model-free RL algorithms to learn policies that satisfy precise temporal requirements on standard benchmarks.
- Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems
  Combines LTL formal methods with LLMs for auditing, predictive monitoring, and runtime intervention on temporally extended behavioral constraints, outperforming LLM baselines and reducing violations.
- Reward Shaping and Action Masking for Compositional Tasks using Behavior Trees and LLMs