On the convergence of policy iteration in stationary dynamic programming. Mathematics of Operations Research, 4(1):60–69, 1979

Martin L Puterman, Shelby L Brumelle · 1979

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Stabilized neural Hamilton–Jacobi–Bellman solvers: Error analysis and applications in model-based reinforcement learning

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

The authors prove a population L2 stability estimate and finite-sample certificate for one policy-evaluation step in a neural HJB solver with learned dynamics, plus multi-step propagation through greedy improvement, with experiments on high-dimensional control tasks.

