FastTD3: Simple, fast, and capable reinforcement learning for humanoid control

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 2, baseline 1

years: 2026 (10), 2025 (2)

representative citing papers

When Does Non-Uniform Replay Matter in Reinforcement Learning?

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 3 refs

Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.

Relative Entropy Pathwise Policy Optimization

cs.LG · 2025-07-15 · unverdicted · novelty 5.0

REPPO is an on-policy RL method that combines pathwise policy gradients with relative entropy constraints to achieve stable training and high sample efficiency without replay buffers.
