RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
Journal of Fourier Analysis and Applications , volume=
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3representative citing papers
A diameter criterion tied to a potential function certifies convergence of difference inclusions, enabling discrete proofs for first-order optimization methods with diminishing steps.
RL4RLA is a reinforcement learning framework that discovers interpretable symbolic randomized linear algebra algorithms by combining curriculum learning and graph-based search to overcome sparse rewards and large search spaces.
citing papers explorer
-
Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
-
Convergence of difference inclusions via a diameter criterion
A diameter criterion tied to a potential function certifies convergence of difference inclusions, enabling discrete proofs for first-order optimization methods with diminishing steps.
-
RL4RLA: Teaching ML to Discover Randomized Linear Algebra Algorithms Through Curriculum Design and Graph-Based Search
RL4RLA is a reinforcement learning framework that discovers interpretable symbolic randomized linear algebra algorithms by combining curriculum learning and graph-based search to overcome sparse rewards and large search spaces.