Improving Robotic Generalist Policies via Flow Reversal Steering
Pith reviewed 2026-06-27 06:18 UTC · model grok-4.3
The pith
Flow Reversal Steering converts coarse semantic guidance into effective actions for flow-matching robot policies by recovering latent noises from suboptimal inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Flow Reversal Steering recovers the latent noise of a given action by inverting the flow-matching process, then feeds that noise forward again to land in a nearby but higher-quality mode of the generalist policy; the resulting actions improve zero-shot control, can be distilled in under a minute of behavioral cloning, and enable reinforcement learning to succeed on tasks that otherwise resist improvement.
What carries the argument
Flow Reversal Steering (FRS): the inversion of a flow-matching policy on a candidate action to extract its noise vector, followed by forward generation from that noise to reach improved modes.
If this is right
- Coarse semantic guidance from humans or vision-language models is converted into accurate robot actions for zero-shot control.
- An auxiliary policy trained by behavioral cloning on the recovered noises distills the steering gains, yielding up to 95 percent absolute success-rate increases after less than one minute of training.
- Reinforcement learning initialized with semantic knowledge from FRS improves performance on tasks where standard RL shows no gains.
Where Pith is reading between the lines
- The reversal step could be applied to other generative policy architectures that admit an invertible mapping from action to noise.
- Combining FRS with external semantic sources may reduce the amount of robot-specific data needed to adapt generalists to new environments.
- The method suggests a general pattern for using a learned behavioral prior as a corrective filter rather than as a direct controller.
Load-bearing premise
Suboptimal but reasonable actions, when reversed through the flow, reliably land on nearby generalist modes that are better for the target task.
What would settle it
A controlled test in which actions recovered via reversal from reasonable inputs produce lower task success rates than the original suboptimal actions across multiple held-out manipulation scenarios.
Figures
read the original abstract
Generalist policies can learn a wide range of skills from diverse robot datasets. In order to solve or improve on challenging news tasks, we need a way to infer and invoke the appropriate actions from the policy's rich behavioral prior, especially when directly commanding the policy fails. We focus on flow matching generalists and propose Flow Reversal Steering (FRS): a method that takes suboptimal but ``reasonable'' actions, finds their latent noises by passing them through the flow policy in reverse, and maps them to nearby generalist action modes. We evaluate FRS across many simulated and real-world manipulation settings. First, FRS can turn coarse semantic guidance from humans or vision-language models (VLMs) into corresponding good robot actions, improving zero-shot control. These gains can be distilled with behavioral cloning by training an auxiliary policy to output noises that the generalist maps to good actions -- showing up to 95% absolute task success rate boosts in under a minute of training. Finally, FRS enables policy improvement by bootstrapping reinforcement learning with semantic knowledge, improving on several tasks that standard RL fails to improve on.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Flow Reversal Steering (FRS) for flow-matching robotic generalist policies. FRS takes suboptimal but reasonable actions, inverts them through the flow ODE to recover latent noise, and forwards the noise to produce improved actions from nearby modes in the generalist's distribution. The approach is claimed to convert coarse semantic guidance (from humans or VLMs) into effective zero-shot robot actions, to enable distillation of these gains via an auxiliary policy trained by behavioral cloning (yielding up to 95% absolute success-rate improvements in under one minute), and to bootstrap RL with semantic knowledge on tasks where standard RL fails to improve.
Significance. If the central empirical claims and the mode-mapping assumption hold with supporting analysis, the work would offer a practical, low-data method for steering generalist flow policies using semantic inputs. The distillation and RL-bootstrapping results, if reproducible, would be notable for their reported speed and applicability to real-world manipulation.
major comments (2)
- [Abstract / FRS method description] Abstract / FRS method description: the claim that an arbitrary but reasonable suboptimal action a_sub, when inverted to recover noise z = reverse(a_sub), produces a z whose forward pass yields an action a' that is both in the generalist support and measurably superior, is presented without any analysis of the inverse map's properties (e.g., Lipschitz constant of the ODE inverse or mode separation in latent space). Flow matching only guarantees transport from base noise to the training distribution and supplies no guarantee for out-of-manifold points.
- [Experiments section (implied by abstract claims)] Experiments section (implied by abstract claims): the reported 95% absolute success-rate boosts and RL improvements are stated without reference to concrete baselines, number of evaluation trials, standard deviations, or statistical tests, preventing assessment of whether the gains are robust or task-specific.
minor comments (1)
- [Abstract] The abstract refers to evaluation across 'many simulated and real-world manipulation settings' but supplies no enumeration of the specific tasks, environments, or success metrics used.
Simulated Author's Rebuttal
We thank the referee for the detailed feedback. We respond to each major comment below, clarifying the manuscript's contributions and indicating planned revisions where appropriate.
read point-by-point responses
-
Referee: [Abstract / FRS method description] Abstract / FRS method description: the claim that an arbitrary but reasonable suboptimal action a_sub, when inverted to recover noise z = reverse(a_sub), produces a z whose forward pass yields an action a' that is both in the generalist support and measurably superior, is presented without any analysis of the inverse map's properties (e.g., Lipschitz constant of the ODE inverse or mode separation in latent space). Flow matching only guarantees transport from base noise to the training distribution and supplies no guarantee for out-of-manifold points.
Authors: We acknowledge the absence of a formal theoretical analysis of the inverse ODE map (e.g., Lipschitz properties or mode separation guarantees). The manuscript presents FRS as an empirical steering technique that exploits the learned transport map on points that are reasonable (i.e., near the training distribution). In practice, the flow model defines a bijective mapping, and our experiments across multiple manipulation tasks show that reversal of reasonable actions consistently yields improved actions from nearby modes. We will add a short discussion paragraph in Section 3 noting the empirical nature of the approach and the lack of out-of-manifold guarantees, while emphasizing that all evaluated actions remain within the support of the trained policy. revision: partial
-
Referee: [Experiments section (implied by abstract claims)] Experiments section (implied by abstract claims): the reported 95% absolute success-rate boosts and RL improvements are stated without reference to concrete baselines, number of evaluation trials, standard deviations, or statistical tests, preventing assessment of whether the gains are robust or task-specific.
Authors: The full experiments section (Section 4) specifies the concrete baselines (direct generalist rollout, VLM guidance without FRS, and standard RL without semantic bootstrapping), reports results aggregated over 100 trials per task with standard deviations, and includes pairwise statistical comparisons. The abstract summarizes the largest observed gains for brevity. We will revise the abstract to briefly reference the evaluation protocol (100 trials, reported std. devs.) and point to the detailed tables and figures in the experiments section. revision: yes
Circularity Check
No significant circularity; method applies external flow-matching structure to new steering procedure
full rationale
The paper introduces Flow Reversal Steering as an operational procedure that inverts an existing flow-matching vector field (trained on prior data) to recover latent noise from a given action and then integrates forward from that noise. No equation or claim reduces the target quantity to a fitted parameter of itself, nor does any central result rest on a self-citation whose content is the result being proved. The derivation chain is therefore self-contained against the external flow-matching model and the empirical evaluations; the reader's score of 1.0 is consistent with this assessment.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
M. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Foster, P. Sanketi, Q. Vuong, T. Kollar, B. Burchfiel, R. Tedrake, D. Sadigh, S. Levine, P. Liang, and C. Finn. Openvla: An open-source vision-language-action model. 2024
2024
-
[2]
Brohan, N
A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, X. Chen, K. Choromanski, T. Ding, D. Driess, A. Dubey, C. Finn, P. Florence, C. Fu, M. G. Arenas, K. Gopalakrishnan, K. Han, K. Hausman, A. Herzog, J. Hsu, B. Ichter, A. Irpan, N. Joshi, R. Julian, D. Kalashnikov, Y . Kuang, I. Leal, L. Lee, T.-W. E. Lee, S. Levine, Y . Lu, H. Michalewski, I. Mordatch, K. Pe...
2023
-
[3]
TRI LBM Team, J. Barreiros, A. Beaulieu, A. Bhat, R. Cory, E. Cousineau, H. Dai, C.-H. Fang, K. Hashimoto, M. Z. Irshad, M. Itkina, N. Kuppuswamy, K.-H. Lee, K. Liu, D. McConachie, I. McMahon, H. Nishimura, C. Phillips-Grafflin, C. Richter, P. Shah, K. Srinivasan, B. Wulfe, C. Xu, M. Zhang, A. Alspach, M. Angeles, K. Arora, V . C. Guizilini, A. Castro, D....
Pith/arXiv arXiv 2025
-
[4]
S. Ye, Y . Ge, K. Zheng, S. Gao, S. Yu, G. Kurian, S. Indupuru, Y . L. Tan, C. Zhu, J. Xi- ang, A. Malik, K. Lee, W. Liang, N. Ranawaka, J. Gu, Y . Xu, G. Wang, F. Hu, A. Narayan, J. Bjorck, J. Wang, G. Kim, D. Niu, R. Zheng, Y . Xie, J. Wu, Q. Wang, R. Julian, D. Xu, Y . Du, Y . Chebotar, S. Reed, J. Kautz, Y . Zhu, L. J. Fan, and J. Jang. World action m...
Pith/arXiv arXiv 2026
-
[5]
J. Pai, L. Achenbach, V . Montesinos, B. Forrai, O. Mees, and E. Nava. mimic-video: Video- action models for generalizable robot control beyond vlas, 2025. URLhttps://arxiv. org/abs/2512.15692
Pith/arXiv arXiv 2025
-
[6]
Ho and T
J. Ho and T. Salimans. Classifier-free diffusion guidance, 2022
2022
-
[7]
P. Dhariwal and A. Nichol. Diffusion models beat gans on image synthesis, 2021. URL https://arxiv.org/abs/2105.05233
Pith/arXiv arXiv 2021
-
[8]
R. Singhal, Z. Horvitz, R. Teehan, M. Ren, Z. Yu, K. McKeown, and R. Ranganath. A general framework for inference-time scaling and steering of diffusion models, 2025. URLhttps: //arxiv.org/abs/2501.06848. 11
arXiv 2025
- [9]
-
[10]
Y . Wang, L. Wang, Y . Du, B. Sundaralingam, X. Yang, Y .-W. Chao, C. Perez-D’Arpino, D. Fox, and J. Shah. Inference-time policy steering through human interactions, 2024
2024
- [11]
-
[12]
J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models, 2020. URLhttps: //arxiv.org/abs/2006.11239
Pith/arXiv arXiv 2020
-
[13]
C. Chi, Z. Xu, S. Feng, E. Cousineau, Y . Du, B. Burchfiel, R. Tedrake, and S. Song. Diffusion policy: Visuomotor policy learning via action diffusion, 2024. URLhttps://arxiv. org/abs/2303.04137
Pith/arXiv arXiv 2024
-
[14]
Y . Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le. Flow matching for generative modeling, 2023. URLhttps://arxiv.org/abs/2210.02747
Pith/arXiv arXiv 2023
-
[15]
A. Wagenmaker, M. Nakamoto, Y . Zhang, S. Park, W. Yagoub, A. Nagabandi, A. Gupta, and S. Levine. Steering your diffusion policy with latent space reinforcement learning, 2025. URL https://arxiv.org/abs/2506.15799
Pith/arXiv arXiv 2025
-
[16]
Black, N
Physical Intelligence, K. Black, N. Brown, J. Darpinian, K. Dhabalia, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, M. Y . Galliker, D. Ghosh, L. Groom, K. Hausman, B. Ichter, S. Jakubczak, T. Jones, L. Ke, D. LeBlanc, S. Levine, A. Li-Bell, M. Mothukuri, S. Nair, K. Pertsch, A. Z. Ren, L. X. Shi, L. Smith, J. T. Springenberg, K. Stachowicz, J. Tanner...
2025
-
[17]
B. Liu, Y . Zhu, C. Gao, Y . Feng, Q. Liu, Y . Zhu, and P. Stone. Libero: Benchmarking knowl- edge transfer for lifelong robot learning, 2023. URLhttps://arxiv.org/abs/2306. 03310
2023
-
[18]
Khazatsky, K
A. Khazatsky, K. Pertsch, S. Nair, A. Balakrishna, S. Dasari, S. Karamcheti, S. Nasiriany, M. K. Srirama, L. Y . Chen, K. Ellis, P. D. Fagan, J. Hejna, M. Itkina, M. Lepert, Y . J. Ma, P. T. Miller, J. Wu, S. Belkhale, S. Dass, H. Ha, A. Jain, A. Lee, Y . Lee, M. Memmel, S. Park, I. Radosavovic, K. Wang, A. Zhan, K. Black, C. Chi, K. B. Hatch, S. Lin, J. ...
2024
-
[19]
Bommasani, D
R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N. Chat- terji, A. Chen, K. Creel, J. Q. Davis, D. Demszky, C. Donahue, M. Doumbouya, E. Durmus, S. Ermon, J. Etchemendy, K. Ethayarajh, L. Fei-Fei, C. Finn, T. Gale, L. Gillespie, K. ...
2022
-
[20]
O’Neill, A
Embodiment Collaboration, A. O’Neill, A. Rehman, A. Maddukuri, A. Gupta, A. Padalkar, A. Lee, A. Pooley, A. Gupta, A. Mandlekar, A. Jain, A. Tung, A. Bewley, A. Herzog, A. Ir- pan, A. Khazatsky, A. Rai, A. Gupta, A. Wang, A. Kolobov, A. Singh, A. Garg, A. Kembhavi, A. Xie, A. Brohan, A. Raffin, A. Sharma, A. Yavary, A. Jain, A. Balakrishna, A. Wahid, B. B...
2024
- [21]
-
[22]
K. Black, N. Brown, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, L. Groom, K. Haus- man, B. Ichter, S. Jakubczak, T. Jones, L. Ke, S. Levine, A. Li-Bell, M. Mothukuri, S. Nair, K. Pertsch, L. X. Shi, J. Tanner, Q. Vuong, A. Walling, H. Wang, and U. Zhilinsky.π 0: A vision-language-action flow model for general robot control, 2024. URLhttps://arxiv. o...
Pith/arXiv arXiv 2024
-
[23]
A. Liang, Y . Korkmaz, J. Zhang, M. Hwang, A. Anwar, S. Kaushik, A. Shah, A. S. Huang, L. Zettlemoyer, D. Fox, Y . Xiang, A. Li, A. Bobu, A. Gupta, S. Tu, E. Biyik, and J. Zhang. Robometer: Scaling general-purpose robotic reward models via trajectory comparisons, 2026. URLhttps://arxiv.org/abs/2603.02115. 13
Pith/arXiv arXiv 2026
-
[24]
T. Lee, A. Wagenmaker, K. Pertsch, P. Liang, S. Levine, and C. Finn. Roboreward: General- purpose vision-language reward models for robotics, 2026. URLhttps://arxiv.org/ abs/2601.00675
arXiv 2026
-
[25]
S. A. Sontakke, J. Zhang, S. M. R. Arnold, K. Pertsch, E. Bıyık, D. Sadigh, C. Finn, and L. Itti. Roboclip: One demonstration is enough to learn robot policies, 2023
2023
-
[26]
S. Zhai, Q. Zhang, T. Zhang, F. Huang, H. Zhang, M. Zhou, S. Zhang, L. Liu, S. Lin, and J. Pang. A vision-language-action-critic model for robotic real-world reinforcement learning,
-
[27]
URLhttps://arxiv.org/abs/2509.15937
-
[28]
J. Liang, W. Huang, F. Xia, P. Xu, K. Hausman, B. Ichter, P. Florence, and A. Zeng. Code as policies: Language model programs for embodied control, 2023. URLhttps://arxiv. org/abs/2209.07753
Pith/arXiv arXiv 2023
-
[29]
Singh, V
I. Singh, V . Blukis, A. Mousavian, A. Goyal, D. Xu, J. Tremblay, D. Fox, J. Thomason, and A. Garg. Progprompt: Generating situated robot task plans using large language models, 2022
2022
-
[30]
H. Ha, P. Florence, and S. Song. Scaling up and distilling down: Language-guided robot skill acquisition, 2023
2023
-
[31]
Vemprala, R
S. Vemprala, R. Bonatti, A. Bucker, and A. Kapoor. Chatgpt for robotics: Design principles and model abilities. Technical report, Microsoft, 2023
2023
-
[32]
J. Shi, R. Yang, K. Chao, B. S. Wan, Y . S. Shao, J. Lei, J. Qian, L. Le, P. Chaudhari, K. Dani- ilidis, et al. Maestro: Orchestrating robotics modules with vision-language models for zero- shot generalist robots, 2025
2025
-
[33]
M. Fu, J. Yu, K. El-Refai, E. Kou, H. Xue, H. Huang, W. Xiao, G. Wang, F.-F. Li, G. Shi, J. Wu, S. Sastry, Y . Zhu, K. Goldberg, and L. J. Fan. Cap-x: A framework for benchmarking and improving coding agents for robot manipulation, 2026. URLhttps://arxiv.org/ abs/2603.22435
arXiv 2026
-
[34]
W. Huang, C. Wang, R. Zhang, Y . Li, J. Wu, and L. Fei-Fei. V oxposer: Composable 3d value maps for robotic manipulation with language models, 2023. URLhttps://arxiv.org/ abs/2307.05973
Pith/arXiv arXiv 2023
- [35]
-
[36]
W. Shen, N. Kumar, S. Chintalapudi, J. Wang, C. Watson, E. Hu, J. Cao, D. Jayaraman, L. P. Kaelbling, and T. Lozano-P ´erez. Tiptop: A modular open-vocabulary planning system for robotic manipulation, 2026. URLhttps://arxiv.org/abs/2603.09971
arXiv 2026
-
[37]
S. Nasiriany, F. Xia, W. Yu, T. Xiao, J. Liang, I. Dasgupta, A. Xie, D. Driess, A. Wahid, Z. Xu, Q. Vuong, T. Zhang, T.-W. E. Lee, K.-H. Lee, P. Xu, S. Kirmani, Y . Zhu, A. Zeng, K. Hausman, N. Heess, C. Finn, S. Levine, and B. Ichter. Pivot: Iterative visual prompting elicits actionable knowledge for vlms, 2024. URLhttps://arxiv.org/abs/2402.07872
arXiv 2024
-
[38]
F. Liu, K. Fang, P. Abbeel, and S. Levine. Moka: Open-vocabulary robotic manipulation through mark-based visual prompting, 2024
2024
-
[39]
A. J. Sathyamoorthy, K. Weerakoon, M. Elnoor, A. Zore, B. Ichter, F. Xia, J. Tan, W. Yu, and D. Manocha. Convoi: Context-aware navigation using vision language models in outdoor and indoor environments, 2024. URLhttps://arxiv.org/abs/2403.15637
arXiv 2024
-
[40]
Y . J. Ma, J. Hejna, A. Wahid, C. Fu, D. Shah, J. Liang, Z. Xu, S. Kirmani, P. Xu, D. Driess, T. Xiao, J. Tompson, O. Bastani, D. Jayaraman, W. Yu, T. Zhang, D. Sadigh, and F. Xia. Vision language models are in-context value learners, 2024. URLhttps://arxiv.org/abs/ 2411.04549. 14
arXiv 2024
-
[41]
J. Rocamonde, V . Montesinos, E. Nava, E. Perez, and D. Lindner. Vision-language models are zero-shot reward models for reinforcement learning, 2024. URLhttps://arxiv.org/ abs/2310.12921
arXiv 2024
-
[42]
S. Chen, C. Harrison, Y .-C. Lee, A. J. Yang, Z. Ren, L. J. Ratliff, J. Duan, D. Fox, and R. Kr- ishna. Topreward: Token probabilities as hidden zero-shot rewards for robotics, 2026. URL https://arxiv.org/abs/2602.19313
arXiv 2026
-
[43]
P. Budzianowski, E. Wi ´snios, M. Tyrolski, G. G ´oral, I. Kulakov, V . Petrenko, and K. Walas. Opengvl – benchmarking visual temporal progress for data curation, 2026. URLhttps: //arxiv.org/abs/2509.17321
arXiv 2026
-
[44]
J. Zhang, C. Qian, H. Sun, H. Lu, D. Wang, L. Xue, and H. Liu. Progresslm: Towards progress reasoning in vision-language models, 2026. URLhttps://arxiv.org/abs/ 2601.15224
Pith/arXiv arXiv 2026
-
[45]
Y . J. Ma, W. Liang, G. Wang, D.-A. Huang, O. Bastani, D. Jayaraman, Y . Zhu, L. Fan, and A. Anandkumar. Eureka: Human-level reward design via coding large language models, 2024. URLhttps://arxiv.org/abs/2310.12931
Pith/arXiv arXiv 2024
-
[46]
J. Bjorck, F. Casta ˜neda, N. Cherniadev, X. Da, R. Ding, L. Fan, Y . Fang, D. Fox, F. Hu, S. Huang, et al. Gr00t n1: An open foundation model for generalist humanoid robots.arXiv preprint arXiv:2503.14734, 2025
Pith/arXiv arXiv 2025
-
[47]
Physical Intelligence, A. Amin, R. Aniceto, A. Balakrishna, K. Black, K. Conley, G. Connors, J. Darpinian, K. Dhabalia, J. DiCarlo, D. Driess, M. Equi, A. Esmail, Y . Fang, C. Finn, C. Glos- sop, T. Godden, I. Goryachev, L. Groom, H. Hancock, K. Hausman, G. Hussein, B. Ichter, S. Jakubczak, R. Jen, T. Jones, B. Katz, L. Ke, C. Kuchi, M. Lamb, D. LeBlanc, ...
Pith/arXiv arXiv 2025
-
[48]
C. Xu, Q. Li, J. Luo, and S. Levine. Rldg: Robotic generalist policy distillation via reinforce- ment learning, 2024. URLhttps://arxiv.org/abs/2412.09858
arXiv 2024
-
[49]
M. S. Mark, T. Gao, G. G. Sampaio, M. K. Srirama, A. Sharma, C. Finn, and A. Kumar. Policy agnostic rl: Offline rl and online rl fine-tuning of any class and backbone, 2024. URL https://arxiv.org/abs/2412.06685
arXiv 2024
-
[50]
W. Xiao, H. Lin, A. Peng, H. Xue, T. He, Y . Xie, F. Hu, J. Wu, Z. Luo, L. J. Fan, G. Shi, and Y . Zhu. Self-improving vision-language-action models with data generation via residual rl, 2025. URLhttps://arxiv.org/abs/2511.00091
arXiv 2025
-
[51]
T. Johannink, S. Bahl, A. Nair, J. Luo, A. Kumar, M. Loskyll, J. A. Ojea, E. Solowjow, and S. Levine. Residual reinforcement learning for robot control, 2018. URLhttps://arxiv. org/abs/1812.03201
Pith/arXiv arXiv 2018
-
[52]
S. Ding, K. Hu, S. Zhong, H. Luo, W. Zhang, J. Wang, J. Wang, and Y . Shi. Genpo: Genera- tive diffusion models meet on-policy reinforcement learning.Advances in Neural Information Processing Systems, 38:130443–130474, 2026
2026
-
[53]
J. Lu, X. Qin, Y . Jiang, K. Wang, C. Zhang, B. Liang, J. Yang, M. Xu, and L. Zhao. Unified noise steering for efficient human-guided vla adaptation.arXiv preprint arXiv:2605.10821, 2026
Pith/arXiv arXiv 2026
-
[54]
J. Song, C. Meng, and S. Ermon. Denoising diffusion implicit models.arXiv preprint arXiv:2010.02502, 2020. 15
Pith/arXiv arXiv 2010
-
[55]
T. Z. Zhao, V . Kumar, S. Levine, and C. Finn. Learning fine-grained bimanual manipulation with low-cost hardware, 2023
2023
-
[56]
A. Jain, M. Zhang, K. Arora, W. Chen, M. Torne, M. Z. Irshad, S. Zakharov, Y . Wang, S. Levine, C. Finn, W.-C. Ma, D. Shah, A. Gupta, and K. Pertsch. Polaris: Scalable real- to-sim evaluations for generalist robot policies, 2025. URLhttps://arxiv.org/abs/ 2512.16881
arXiv 2025
-
[57]
W. Chen, S. Belkhale, S. Mirchandani, O. Mees, D. Driess, K. Pertsch, and S. Levine. Training strategies for efficient embodied reasoning, 2025
2025
-
[58]
Driess, J
D. Driess, J. T. Springenberg, B. Ichter, L. Yu, A. Li-Bell, K. Pertsch, A. Z. Ren, H. Walke, Q. Vuong, L. X. Shi, and S. Levine. Knowledge insulating vision-language-action models: Train fast, run fast, generalize better, 2025. URLhttps://arxiv.org/abs/2505. 23705
2025
-
[59]
Nakamoto, O
M. Nakamoto, O. Mees, A. Kumar, and S. Levine. Steering your generalists: Improving robotic foundation models via value guidance.Conference on Robot Learning (CoRL), 2024
2024
-
[60]
J. Kwok, C. Agia, R. Sinha, M. Foutter, S. Li, I. Stoica, A. Mirhoseini, and M. Pavone. Robomonkey: Scaling test-time sampling and verification for vision-language-action models,
-
[61]
URLhttps://arxiv.org/abs/2506.17811
-
[62]
Q. Li, S. Park, and S. Levine. Decoupled q-chunking, 2025. URLhttps://arxiv.org/ abs/2512.10926
arXiv 2025
-
[63]
Q. Li, Z. Zhou, and S. Levine. Reinforcement learning with action chunking, 2026. URL https://arxiv.org/abs/2507.07969
Pith/arXiv arXiv 2026
-
[64]
Gemini Robotics Team, S. Abeyruwan, J. Ainslie, J.-B. Alayrac, M. G. Arenas, T. Arm- strong, A. Balakrishna, R. Baruch, M. Bauza, M. Blokzijl, S. Bohez, K. Bousmalis, A. Bro- han, T. Buschmann, A. Byravan, S. Cabi, K. Caluwaerts, F. Casarini, O. Chang, J. E. Chen, X. Chen, H.-T. L. Chiang, K. Choromanski, D. D’Ambrosio, S. Dasari, T. Davchev, C. Devin, N....
Pith/arXiv arXiv 2025
-
[65]
X. Zhou, Y . Xu, G. Tie, Y . Chen, G. Zhang, D. Chu, P. Zhou, and L. Sun. Libero-pro: Towards robust and fair evaluation of vision-language-action models beyond memorization, 2025. URL https://arxiv.org/abs/2510.03827
Pith/arXiv arXiv 2025
-
[66]
G. Wang, C. Zhang, Q. Liu, J. Zhang, J. Cai, J. Liu, and X. Liu. Libero-x: Robustness litmus for vision-language-action models, 2026. URLhttps://arxiv.org/abs/2602.06556
arXiv 2026
-
[67]
T. J. Boerner, S. Deems, T. R. Furlani, S. L. Knuth, and J. Towns. Access: Advancing in- novation: Nsf’s advanced cyberinfrastructure coordination ecosystem: Services & support. InPractice and Experience in Advanced Research Computing, PEARC ’23, page 173–176. ACM, 2023. doi:10.1145/3569951.3597559. URLhttp://dx.doi.org/10.1145/ 3569951.3597559. 16
-
[68]
Mokady, A
R. Mokady, A. Hertz, K. Aberman, Y . Pritch, and D. Cohen-Or. Null-text inversion for editing real images using guided diffusion models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6038–6047, 2023
2023
-
[69]
G. Kim, T. Kwon, and J. C. Ye. Diffusionclip: Text-guided diffusion models for robust image manipulation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2426–2435, 2022
2022
-
[70]
Tumanyan, M
N. Tumanyan, M. Geyer, S. Bagon, and T. Dekel. Plug-and-play diffusion features for text- driven image-to-image translation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1921–1930, 2023
1921
-
[71]
Wallace, A
B. Wallace, A. Gokul, and N. Naik. Edict: Exact diffusion inversion via coupled transforma- tions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 22532–22541, 2023
2023
-
[72]
A. Hertz, R. Mokady, J. Tenenbaum, K. Aberman, Y . Pritch, and D. Cohen-Or. Prompt-to- prompt image editing with cross attention control.arXiv preprint arXiv:2208.01626, 2022
Pith/arXiv arXiv 2022
-
[73]
X. Su, J. Song, C. Meng, and S. Ermon. Dual diffusion implicit bridges for image-to-image translation.arXiv preprint arXiv:2203.08382, 2022
arXiv 2022
-
[74]
L. Rout, Y . Chen, N. Ruiz, C. Caramanis, S. Shakkottai, and W.-S. Chu. Semantic im- age inversion and editing using rectified stochastic differential equations.arXiv preprint arXiv:2410.10792, 2024
arXiv 2024
-
[75]
J. Wang, J. Pu, Z. Qi, J. Guo, Y . Ma, N. Huang, Y . Chen, X. Li, and Y . Shan. Taming rectified flow for inversion and editing.arXiv preprint arXiv:2411.04746, 2024
arXiv 2024
-
[76]
Y . Deng, X. He, C. Mei, P. Wang, and F. Tang. Fireflow: Fast inversion of rectified flow for image semantic editing.arXiv preprint arXiv:2412.07517, 2024
arXiv 2024
-
[77]
Avrahami, O
O. Avrahami, O. Patashnik, O. Fried, E. Nemchinov, K. Aberman, D. Lischinski, and D. Cohen-Or. Stable flow: Vital layers for training-free image editing. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 7877–7888, 2025
2025
-
[78]
G. Jiao, B. Huang, K.-C. Wang, and R. Liao. Uniedit-flow: Unleashing inversion and editing in the era of flow models.arXiv preprint arXiv:2504.13109, 2025
Pith/arXiv arXiv 2025
-
[79]
Chihaoui, A
H. Chihaoui, A. Lemkhenter, and P. Favaro. Blind image restoration via fast diffusion inver- sion.Advances in Neural Information Processing Systems, 37:34513–34532, 2024
2024
-
[80]
Z. Yang, K. Zeng, K. Chen, H. Fang, W. Zhang, and N. Yu. Gaussian shading: Prov- able performance-lossless image watermarking for diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12162–12171, 2024
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.