UPS framework uses conformal prediction to calibrate VLM verifiers for choosing between high-confidence action execution, natural language task queries, or policy interventions, then applies residual learning from interventions to continually improve the base policy with minimal feedback.
Arxiv Preprint Arxiv:2510.16281 , year=
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.RO 4years
2026 4verdicts
UNVERDICTED 4roles
background 2polarities
background 2representative citing papers
Interventional attribution via ISS and NMR diagnoses causal misalignment in VLA policies and predicts their generalization performance across manipulation tasks.
SV-VLA uses infrequent heavy VLA planning of action chunks plus a lightweight closed-loop verifier to achieve both efficiency and robustness in dynamic robot control.
R&B-EnCoRe uses self-supervised importance-weighted variational inference to distill action-predictive reasoning datasets that improve VLA performance on manipulation, navigation, and driving tasks without external verifiers.
citing papers explorer
-
When to Act, Ask, or Learn: Uncertainty-Aware Policy Steering
UPS framework uses conformal prediction to calibrate VLM verifiers for choosing between high-confidence action execution, natural language task queries, or policy interventions, then applies residual learning from interventions to continually improve the base policy with minimal feedback.
-
Embodied Interpretability: Linking Causal Understanding to Generalization in Vision-Language-Action Models
Interventional attribution via ISS and NMR diagnoses causal misalignment in VLA policies and predicts their generalization performance across manipulation tasks.
-
Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA
SV-VLA uses infrequent heavy VLA planning of action chunks plus a lightweight closed-loop verifier to achieve both efficiency and robustness in dynamic robot control.
-
Self-Supervised Bootstrapping of Action-Predictive Embodied Reasoning
R&B-EnCoRe uses self-supervised importance-weighted variational inference to distill action-predictive reasoning datasets that improve VLA performance on manipulation, navigation, and driving tasks without external verifiers.