Stationary reweighting of soft fitted Q-iteration yields finite-sample local linear convergence to the projected fixed point under approximate realizability and controlled weighting error, even without Bellman completeness.
A natural way to ensure this is to begin with a large softmax temperature τ, for which the contraction radius r_0(τ) grows as τ^(1/α) while the soft–hard bias scales only as O(τ)
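The O(τ) soft–hard bias mentioned in the excerpt can be checked numerically. The sketch below assumes the standard log-sum-exp form of the soft maximum, τ·log Σ_a exp(q(a)/τ); the paper's exact operator may differ, for example by normalizing against a reference policy. The function names and the action values are hypothetical and chosen only for illustration. Under this assumption the gap between the soft and hard maximum lies between 0 and τ·log|A|, where |A| is the number of actions.

```python
import math


def soft_max_value(q_values, tau):
    """Soft maximum tau * log(sum(exp(q / tau))), computed stably.

    Assumes the plain log-sum-exp definition; this is an illustrative
    choice, not necessarily the operator used in the paper.
    """
    m = max(q_values)
    # Subtracting the max keeps every exponent <= 0 and avoids overflow.
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def soft_hard_gap(q_values, tau):
    """Difference between the soft maximum and the hard maximum."""
    return soft_max_value(q_values, tau) - max(q_values)


# Hypothetical action values for a single state.
q = [1.0, 0.5, -0.2, 0.9]

for tau in (1.0, 0.1, 0.01):
    gap = soft_hard_gap(q, tau)
    bound = tau * math.log(len(q))  # tau * log|A|
    print(f"tau={tau:<5} gap={gap:.6f} bound={bound:.6f}")
```

The gap shrinks at least linearly as τ decreases, which is consistent with starting at a large temperature and lowering it once the iterates are inside the contraction region. The schedule itself is not specified in the excerpt and is not modeled here.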
1 Pith paper cites this work (field: stat.ML; year: 2025; verdict: unverdicted). Representative citing paper:
Stationary Reweighting Yields Local Convergence of Soft Fitted Q-Iteration