Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers
- Contraction-Aligned Analysis of Soft Bellman Residual Minimization with Weighted Lp-Norm for Markov Decision Problem
  Soft Bellman residual minimization with weighted Lp-norm aligns the objective with Bellman contraction as p increases and yields performance error bounds.
- Koopman-Assisted Reinforcement Learning
  Koopman-assisted RL reformulates max-entropy algorithms using controlled Koopman tensors and reports SOTA performance versus neural SAC on Lorenz, fluid flow, and other systems.