Anytime-constrained reinforcement learning

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Bellman Value Decomposition for Task Logic in Safe Optimal Control

cs.RO · 2026-02-23 · unverdicted · novelty 7.0

Bellman values for temporal logic tasks decompose into a graph of reach-avoid, avoid, and reach-avoid-loop equations solved by embedding the graph in a two-layer neural net (VDPPO) for safe high-dimensional control.

citing papers explorer

Bellman Value Decomposition for Task Logic in Safe Optimal Control · cs.RO · 2026-02-23 · unverdicted · none · ref 38
