Towards general-purpose model-free reinforcement learning

Scott Fujimoto, Pierluca D’Oro, Amy Zhang, Yuandong Tian, Michael Rabbat · 2025 · arXiv 2501.16142

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · baseline: 1

citation-polarity summary

background: 2 · baseline: 1

representative citing papers

Multi-scale Predictive Representations for Goal-conditioned Reinforcement Learning

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Ms.PR applies multi-scale predictive supervision to enforce goal-directed alignment in latent spaces for offline GCRL, yielding improved representation quality and performance on vision and state-based tasks.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding

cs.LG · 2025-08-11 · unverdicted · novelty 6.0

HR3L enables robust remote RL training over unreliable channels via homomorphic state encoding without gradient exchange, outperforming prior methods in sample efficiency and adapting to packet loss, delays, and bandwidth limits.

When Does Non-Uniform Replay Matter in Reinforcement Learning?

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 3 refs

Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.

citing papers explorer

Multi-scale Predictive Representations for Goal-conditioned Reinforcement Learning · cs.LG · 2026-05-10 · unverdicted · none · ref 11
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control · cs.LG · 2026-04-06 · unverdicted · none · ref 17 · 2 links
Robust Remote Reinforcement Learning over Unreliable Communication Channels using Homomorphic State Encoding · cs.LG · 2025-08-11 · unverdicted · none · ref 32
When Does Non-Uniform Replay Matter in Reinforcement Learning? · cs.LG · 2026-05-11 · unverdicted · none · ref 10 · 3 links
