Title resolution pending

5 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.RO: 4 · cs.LG: 1

representative citing papers

EXPO: Stable Reinforcement Learning with Expressive Policies

cs.LG · 2025-07-10 · conditional · novelty 7.0

EXPO stabilizes online RL for expressive policies by training a base policy with imitation and using a lightweight Gaussian edit policy to select higher-value actions on the fly for sampling and TD backups.

Diffusion Policy Policy Optimization

cs.RO · 2024-09-01 · unverdicted · novelty 6.0

DPPO fine-tunes diffusion policies via policy gradients and outperforms prior RL approaches for diffusion policies and PG-tuned alternatives on robot benchmarks while enabling stable training and hardware deployment.

HDFlow: Hierarchical Diffusion-Flow Planning for Long-horizon Tasks

cs.RO · 2026-05-06 · unverdicted · novelty 5.0 · 2 refs

HDFlow pairs a high-level diffusion planner for strategic subgoals with a low-level rectified flow planner for efficient trajectories, claiming superior performance on furniture assembly and other long-horizon robotic benchmarks.

citing papers explorer

  • EXPO: Stable Reinforcement Learning with Expressive Policies cs.LG · 2025-07-10 · conditional · none · ref 1

    EXPO stabilizes online RL for expressive policies by training a base policy with imitation and using a lightweight Gaussian edit policy to select higher-value actions on the fly for sampling and TD backups.

  • Steering Your Diffusion Policy with Latent Space Reinforcement Learning cs.RO · 2025-06-18 · unverdicted · none · ref 64

    DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.

  • RoHIL: Robust Human-in-the-Loop Robotic Reinforcement Learning Against Illumination Variations cs.RO · 2026-05-19 · unverdicted · none · ref 2

    RoHIL adapts human-in-the-loop RL policies to new illumination conditions offline by combining world-model image relighting, illumination-retention replay, and anchored Bellman regularisation, improving shifted-light performance while preserving source performance on four real-robot tasks.

  • Diffusion Policy Policy Optimization cs.RO · 2024-09-01 · unverdicted · none · ref 6

    DPPO fine-tunes diffusion policies via policy gradients and outperforms prior RL approaches for diffusion policies and PG-tuned alternatives on robot benchmarks while enabling stable training and hardware deployment.

  • HDFlow: Hierarchical Diffusion-Flow Planning for Long-horizon Tasks cs.RO · 2026-05-06 · unverdicted · none · ref 1 · 2 links

    HDFlow pairs a high-level diffusion planner for strategic subgoals with a low-level rectified flow planner for efficient trajectories, claiming superior performance on furniture assembly and other long-horizon robotic benchmarks.