pith. sign in

Switching-Geometry Analysis of Deflated Q-Value Iteration

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

This paper develops a joint spectral radius (JSR) framework for analyzing rank-one deflated Q-value iteration (Q-VI) in discounted Markov decision process control. Focusing on an all-ones residual correction, we interpret the resulting algorithm through the geometry of switching systems and, to the best of our knowledge, give the first JSR-based convergence analysis of deflated Q-VI for policy optimization problems. Our analysis reveals that the standard Q-VI switching system model has JSR exactly the discount factor $\gamma\in (0,1)$, since all admissible subsystems share the all-ones vector as an invariant direction. By passing to the quotient space that removes this direction, we obtain a projected switching system model whose JSR governs the relevant error dynamics and may be strictly smaller than $\gamma$. Therefore, the deflated Q-VI admits a potentially sharper convergence-rate characterization than the ambient-space $\gamma$-bound. Finally, we prove that the correction is equivalent to a scalar recentering of standard Q-VI. Hence, the projected trajectory, and therefore the greedy-policy sequence, is unchanged relative to standard Q-VI initialized from the same point. The benefit of deflation is not a change in the induced decision-making problem, but a more precise JSR-based description of the convergence geometry after the redundant all-ones component is removed.

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

Heavy-Ball Q-Learning with Residual Weighting Correction

cs.LG · 2026-06-25 · unverdicted · novelty 7.0

Corrected heavy-ball Q-learning with convergence and acceleration guarantees is derived via switched linear system and joint spectral radius analysis, extended to linear function approximation.

citing papers explorer

Showing 1 of 1 citing paper.

  • Heavy-Ball Q-Learning with Residual Weighting Correction cs.LG · 2026-06-25 · unverdicted · none · ref 11 · internal anchor

    Corrected heavy-ball Q-learning with convergence and acceleration guarantees is derived via switched linear system and joint spectral radius analysis, extended to linear function approximation.