Integrating behavior cloning and reinforcement learning for improved performance in dense and sparse reward environments
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
DF-ExpEnse improves sample efficiency in finetuning diffusion-based robotic policies by filtering diffusion-generated actions with critic ensembles and enabling fleet-level collaboration.
TacCoRL integrates tactile feedback into VLA policies via real-aligned simulation co-training and RL, raising average success from 50% to 72.5% on four bimanual contact-rich tasks with direct real-robot transfer.
citing papers explorer
- Do Less, Achieve More: Do We Need Every-Step Optimization for RL Fine-tuning of Diffusion Models?
  AdaScope adaptively selects optimal RL intervention points during diffusion denoising by monitoring structural and semantic changes, delivering 66% higher performance at 59% lower cost than full-trajectory RL baselines.
- DF-ExpEnse: Diffusion Filtered Exploration for Sample Efficient Finetuning
  DF-ExpEnse improves sample efficiency in finetuning diffusion-based robotic policies by filtering diffusion-generated actions with critic ensembles and enabling fleet-level collaboration.
- TacCoRL: Integrating Tactile Feedback into VLA via Simulation
  TacCoRL integrates tactile feedback into VLA policies via real-aligned simulation co-training and RL, raising average success from 50% to 72.5% on four bimanual contact-rich tasks with direct real-robot transfer.