Sutton and Andrew G

Richard S · 2018

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Gaussian Approximation for Asynchronous Q-learning

stat.ML · 2026-04-08 · unverdicted · novelty 7.0

Derives rates of order up to n^{-1/6} log^4(n S A) for the high-dimensional CLT of averaged asynchronous Q-learning iterates, plus a general martingale-difference CLT.

Emergence of agriculture in an artificial society of reinforcement learning agents

cs.MA · 2026-05-21 · unverdicted · novelty 6.0

Agriculture emerges spontaneously in an RL agent society through planning for delayed rewards, social learning that counters cheaters, and an irreversible lock-in effect.

The Streaming Reservoir Convergence Theorem: A Prospect-Theoretic Framework for Multi-Provider Adaptive Streaming

cs.MM · 2026-05-04 · unverdicted · novelty 6.0

SRCT models streaming as concurrent reservoir filling with k standby streams, proving harmonic uptime bounds, 3-5x acquisition speedup, monotonic quality convergence, and a prospect-theoretic no-thrash switching rule.

Hyperfastrl: Hypernetwork-based reinforcement learning for unified control of parametric chaotic PDEs

cs.CE · 2026-04-07 · unverdicted · novelty 6.0

Hypernetworks map a forcing parameter directly to policy weights in an RL framework, enabling unified stabilization of the Kuramoto-Sivashinsky equation across regimes with KAN architectures showing strongest extrapolation.

Why Code, Why Now: An Information-Theoretic Perspective on the Limits of Machine Learning

cs.LG · 2026-02-15 · unverdicted · novelty 6.0

Task information structure determines ML scaling success, with code's dense verifiable signals enabling predictable progress while sparse-feedback tasks like typical RL do not.

On Gaussian approximation for entropy-regularized Q-learning with function approximation

stat.ML · 2026-05-17 · unverdicted · novelty 5.0

Establishes n^{-1/4} Gaussian approximation in convex distance for averaged entropy-regularized Q-learning with linear function approximation and polynomial stepsizes.

Hybrid-AIRL: Enhancing Inverse Reinforcement Learning with Supervised Expert Guidance

cs.LG · 2025-11-26 · unverdicted · novelty 5.0

Hybrid-AIRL adds supervised expert guidance and stochastic regularization to AIRL, yielding higher sample efficiency and more stable learning on Gymnasium benchmarks and Heads-Up Limit Hold'em poker.
