Ms.PR applies multi-scale predictive supervision to enforce goal-directed alignment in latent spaces for offline GCRL, yielding improved representation quality and performance on vision and state-based tasks.
Towards general-purpose model-free reinforcement learning
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.LG 4verdicts
UNVERDICTED 4representative citing papers
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
HR3L enables robust remote RL training over unreliable channels via homomorphic state encoding without gradient exchange, outperforming prior methods in sample efficiency and adapting to packet loss, delays, and bandwidth limits.
Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.
citing papers explorer
-
Multi-scale Predictive Representations for Goal-conditioned Reinforcement Learning
Ms.PR applies multi-scale predictive supervision to enforce goal-directed alignment in latent spaces for offline GCRL, yielding improved representation quality and performance on vision and state-based tasks.
-
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
-
Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding
HR3L enables robust remote RL training over unreliable channels via homomorphic state encoding without gradient exchange, outperforming prior methods in sample efficiency and adapting to packet loss, delays, and bandwidth limits.
-
When Does Non-Uniform Replay Matter in Reinforcement Learning?
Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.