Canonical reference

Steering your generalists: Improving robotic foundation models via value guidance

2024 · arXiv 2410.13816

Canonical reference. 6 of the 7 classified citations from Pith papers (86%) cite this work as background.

9 Pith papers citing it


citation-role summary

background: 6 · method: 1

citation-polarity summary

background: 6 · use method: 1

representative citing papers

Offline Policy Evaluation for Manipulation Policies via Discounted Liveness Formulation

cs.RO · 2026-05-12 · conditional · novelty 7.0

A liveness-based Bellman operator enables conservative offline policy evaluation for manipulation tasks by encoding task progression and reducing truncation bias from finite horizons.

Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations

cs.RO · 2026-04-27 · unverdicted · novelty 7.0 · 2 refs

ACO-MoE recovers 95.3% of clean-input performance in visual control tasks under Markov-switching corruptions by routing restoration experts and anchoring representations to clean foreground masks.

Action Images: End-to-End Policy Learning via Multiview Video Generation

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.

You've Got a Golden Ticket: Improving Generative Robot Policies With A Single Noise Vector

cs.RO · 2026-03-16 · conditional · novelty 7.0

Optimizing a single constant initial noise vector for frozen generative robot policies improves success rates on 38 of 43 tasks, with relative gains of up to 58%.

Steering Your Diffusion Policy with Latent Space Reinforcement Learning

cs.RO · 2025-06-18 · unverdicted · novelty 7.0

DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.

VLA-ATTC: Adaptive Test-Time Compute for VLA Models with Relative Action Critic Model

cs.RO · 2026-05-02 · unverdicted · novelty 6.0

VLA-ATTC equips VLA models with adaptive test-time compute via an uncertainty clutch and relative action critic, cutting failure rates by over 50% on LIBERO-LONG.

Breaking Lock-In: Preserving Steerability under Low-Data VLA Post-Training

cs.RO · 2026-04-25 · unverdicted · novelty 6.0

DeLock mitigates lock-in in low-data VLA post-training via visual grounding preservation and test-time contrastive prompt guidance, outperforming baselines across eight evaluations while matching data-heavy generalist policies.

FASTER: Value-Guided Sampling for Fast RL

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

FASTER models multi-candidate denoising as an MDP and trains a value function to filter actions early, delivering the performance of full sampling at lower cost in diffusion RL policies.

Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation

cs.RO · 2026-04-09 · unverdicted · novelty 6.0

Test-time steering of pre-trained whole-body policies via sample-based planning lets legged robots generalize dynamic loco-manipulation to varied heavy objects and tasks without additional training or tuning.

citing papers explorer

Showing 9 of 9 citing papers.

Offline Policy Evaluation for Manipulation Policies via Discounted Liveness Formulation · cs.RO · 2026-05-12 · conditional · none · ref 19
Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations · cs.RO · 2026-04-27 · unverdicted · none · ref 42 · 2 links
Action Images: End-to-End Policy Learning via Multiview Video Generation · cs.CV · 2026-04-07 · unverdicted · none · ref 42
You've Got a Golden Ticket: Improving Generative Robot Policies With A Single Noise Vector · cs.RO · 2026-03-16 · conditional · none · ref 24
Steering Your Diffusion Policy with Latent Space Reinforcement Learning · cs.RO · 2025-06-18 · unverdicted · none · ref 54
VLA-ATTC: Adaptive Test-Time Compute for VLA Models with Relative Action Critic Model · cs.RO · 2026-05-02 · unverdicted · none · ref 15
Breaking Lock-In: Preserving Steerability under Low-Data VLA Post-Training · cs.RO · 2026-04-25 · unverdicted · none · ref 57
FASTER: Value-Guided Sampling for Fast RL · cs.LG · 2026-04-21 · unverdicted · none · ref 15
Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation · cs.RO · 2026-04-09 · unverdicted · none · ref 33
