A Switching System Theory of Q-Learning with Linear Function Approximation

2026 · cs.LG · arXiv 2605.11021

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

This paper develops a switching-system interpretation of Q-learning with linear function approximation (LFA) based on the joint spectral radius (JSR). We derive an exact linear switched model for the mean dynamics and relate convergence to stability of the corresponding switched system. The same construction is then used for stochastic linear Q-learning with independent and identically distributed (i.i.d.) observations and with Markovian observations. Although exact JSR computation is difficult in general, the certificate captures products of switching modes and can be less conservative than one-step norm bounds. The framework also yields a JSR-based view of regularized Q-learning with LFA. The resulting analysis connects projected Bellman equations, finite-difference stochastic-policy switching, and switched-system stability in a single parameter-space formulation.
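The remark that a JSR certificate "captures products of switching modes and can be less conservative than one-step norm bounds" can be illustrated with a small numerical sketch. For a finite set of matrices, every product length k gives the standard sandwich: the largest spectral radius over length-k products, raised to the power 1/k, is a lower bound on the JSR, and the largest induced norm over the same products, raised to 1/k, is an upper bound. The two matrices below are hypothetical and are not taken from the paper. The function name `jsr_bounds` and the brute-force enumeration are assumptions made for illustration, not the paper's construction.

```python
import itertools
from functools import reduce

import numpy as np


def jsr_bounds(mats, k):
    """Brute-force JSR bounds from all products of length k.

    Returns (lower, upper), where
      lower = max over products P of rho(P) ** (1/k)
      upper = max over products P of ||P||_2 ** (1/k)
    The cost grows as len(mats) ** k, so this only suits tiny examples.
    """
    lower, upper = 0.0, 0.0
    for word in itertools.product(mats, repeat=k):
        prod = reduce(np.matmul, word)
        rho = np.max(np.abs(np.linalg.eigvals(prod)))  # spectral radius
        nrm = np.linalg.norm(prod, 2)                  # spectral norm
        lower = max(lower, rho ** (1.0 / k))
        upper = max(upper, nrm ** (1.0 / k))
    return lower, upper


# Hypothetical switching modes (illustrative only, not from the paper).
modes = [
    np.array([[0.5, 1.0], [0.0, 0.5]]),
    np.array([[0.5, 0.0], [0.0, 0.5]]),
]

lo1, up1 = jsr_bounds(modes, 1)  # one-step norm bound
lo4, up4 = jsr_bounds(modes, 4)  # bound from products of length 4

print(f"k=1: {lo1:.3f} <= JSR <= {up1:.3f}")
print(f"k=4: {lo4:.3f} <= JSR <= {up4:.3f}")
```

In this toy case the one-step norm bound exceeds 1 and so certifies nothing, while the length-4 products already give an upper bound below 1, which is enough to conclude that the switched system is stable under arbitrary switching.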

representative citing papers

Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics

stat.ML · 2026-05-31 · unverdicted · novelty 7.0

Periodic and soft target updates guarantee convergence in linear Q-learning to the exact projected Q-Bellman solution under spectral and step-size conditions via joint spectral radius analysis of switched linear systems.

Geometrically Averaged Hard Target Updates for Linear Q-Learning

cs.LG · 2026-06-09 · unverdicted · novelty 6.0

Introduces and analyzes the λ-target update for linear Q-learning via geometric averaging of periodic target maps, studied with a switching-system model in the deterministic case.

citing papers explorer


Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics
stat.ML · 2026-05-31 · unverdicted · none · ref 17 · internal anchor

Geometrically Averaged Hard Target Updates for Linear Q-Learning
cs.LG · 2026-06-09 · unverdicted · none · ref 16 · internal anchor
