hub

Wmpo: World model-based policy optimization for vision-language-action models

Fangqi Zhu, Zhengyang Yan, Zicong Hong, Quanxin Shou, Xiao Ma, Song Guo · 2025 · arXiv 2511.09515

17 Pith papers cite this work. Polarity classification is still indexing.

17 Pith papers citing it

read on arXiv browse 17 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 3 method 1

citation-polarity summary

background 3 use method 1

representative citing papers

Point Tracking Improves World Action Models

cs.RO · 2026-05-22 · unverdicted · novelty 7.0

JOPAT jointly models pixels, point tracks, and actions in a diffusion transformer and reports gains over pixel-only baselines on long-horizon robot tasks with occlusion and off-screen motion.

Learn Where Outcomes Diverge: Efficient VLA RL via Probabilistic Chunk Masking

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

PCM uses success-failure action variance to probabilistically select and mask chunks for gradient updates in GRPO, matching standard success rates with 2.38x wall-clock speedup and 60% lower memory on LIBERO benchmarks.

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

MultiWorld is a scalable framework for multi-agent multi-view video world models that improves controllability and consistency over single-agent baselines in game and robot tasks.

PlayWorld: Learning Robot World Models from Autonomous Play

cs.RO · 2026-03-09 · unverdicted · novelty 7.0

PlayWorld learns high-fidelity robot world models from unsupervised self-play, producing physically consistent video predictions that outperform models trained on human data and enabling 65% better real-world policy performance via model-based RL.

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

cs.RO · 2026-02-06 · unverdicted · novelty 7.0

DreamDojo is a foundation world model pretrained on the largest human video dataset to date that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.

Reinforcing VLAs in Task-Agnostic World Models

cs.AI · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

RAW-Dream disentangles world-model learning from task data by using a pre-trained task-agnostic world model and VLM rewards, with dual-noise filtering, to enable zero-shot VLA adaptation in simulation and real settings.

Escaping the Diversity Trap in Robotic Manipulation via Anchor-Centric Adaptation

cs.RO · 2026-05-08 · unverdicted · novelty 6.0

Anchor-Centric Adaptation escapes the diversity trap by prioritizing repeated demonstrations at core anchors over broad coverage, yielding higher success rates under fixed data budgets in robotic manipulation.

RoboStereo: Dual-Tower 4D Embodied World Models for Unified Policy Optimization

cs.CV · 2026-03-13 · unverdicted · novelty 6.0

A dual-tower 4D embodied world model called RoboStereo reduces geometric hallucinations and delivers over 97% relative improvement on manipulation tasks via test-time augmentation, imitative learning, and open exploration.

RISE: Self-Improving Robot Policy with Compositional World Model

cs.RO · 2026-02-11 · unverdicted · novelty 6.0

RISE combines a controllable dynamics model and progress value model into a closed-loop self-improving pipeline that updates robot policies entirely in imagination, reporting over 35% absolute gains on three real-world tasks.

Towards Long-Lived Robots: Continual Learning VLA Models via Reinforcement Fine-Tuning

cs.RO · 2026-02-11 · unverdicted · novelty 6.0

LifeLong-RFT applies chunking-level on-policy reinforcement learning with Quantized Action Consistency Reward, Continuous Trajectory Alignment Reward, and Format Compliance Reward to fine-tune VLA models, achieving a 22% average success rate gain over supervised fine-tuning on the LIBERO benchmark's

WorldArena 2.0: Extending Embodied World Model Benchmarking on Modality, Functionality and Platform

cs.RO · 2026-05-18 · unverdicted · novelty 5.0

WorldArena 2.0 extends embodied world model benchmarks to visuotactile perception, interactive policy training, and diverse real and simulated robotic platforms under a unified protocol.

Sword: Style-Robust World Models as Simulators via Dynamic Latent Bootstrapping for VLA Policy Post-Training

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

Sword improves world model simulators for VLA policies by disentangling visual style from dynamics and bootstrapping latents for better consistency, outperforming baselines on LIBERO in generalization and RL post-training success.

Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models

cs.CV · 2026-05-07 · unverdicted · novelty 5.0

Semantic latent spaces from pretrained encoders outperform reconstruction-based spaces for robotic world models on planning and downstream policy performance.

STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation

cs.RO · 2026-04-29 · unverdicted · novelty 5.0

STARRY uses unified diffusion to align spatial-temporal world predictions with action generation plus GASAM for geometry-aware attention, reaching 93.82%/93.30% success on 50 bimanual tasks in simulation and raising real-world success from 42.5% to 70.8%.

World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems

cs.RO · 2026-04-16 · unverdicted · novelty 5.0

The World-Value-Action model enables implicit planning for VLA systems by performing inference over a learned latent representation of high-value future trajectories instead of direct action prediction.

Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts

cs.CV · 2026-05-21 · unverdicted · novelty 4.0

Pre-VLA is a multimodal runtime verifier that predicts safety confidence and advantage scores for action chunks, raising closed-loop success rates on the LIBERO benchmark from 30.79% to 37.62%.

World Action Models: The Next Frontier in Embodied AI

cs.RO · 2026-05-12 · unverdicted · novelty 4.0

The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.

citing papers explorer

Showing 17 of 17 citing papers.

Point Tracking Improves World Action Models cs.RO · 2026-05-22 · unverdicted · none · ref 23
JOPAT jointly models pixels, point tracks, and actions in a diffusion transformer and reports gains over pixel-only baselines on long-horizon robot tasks with occlusion and off-screen motion.
Learn Where Outcomes Diverge: Efficient VLA RL via Probabilistic Chunk Masking cs.LG · 2026-05-15 · unverdicted · none · ref 35
PCM uses success-failure action variance to probabilistically select and mask chunks for gradient updates in GRPO, matching standard success rates with 2.38x wall-clock speedup and 60% lower memory on LIBERO benchmarks.
MultiWorld: Scalable Multi-Agent Multi-View Video World Models cs.CV · 2026-04-20 · unverdicted · none · ref 74
MultiWorld is a scalable framework for multi-agent multi-view video world models that improves controllability and consistency over single-agent baselines in game and robot tasks.
PlayWorld: Learning Robot World Models from Autonomous Play cs.RO · 2026-03-09 · unverdicted · none · ref 50
PlayWorld learns high-fidelity robot world models from unsupervised self-play, producing physically consistent video predictions that outperform models trained on human data and enabling 65% better real-world policy performance via model-based RL.
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos cs.RO · 2026-02-06 · unverdicted · none · ref 131
DreamDojo is a foundation world model pretrained on the largest human video dataset to date that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.
Reinforcing VLAs in Task-Agnostic World Models cs.AI · 2026-05-12 · unverdicted · none · ref 46 · 2 links
RAW-Dream disentangles world-model learning from task data by using a pre-trained task-agnostic world model and VLM rewards, with dual-noise filtering, to enable zero-shot VLA adaptation in simulation and real settings.
Escaping the Diversity Trap in Robotic Manipulation via Anchor-Centric Adaptation cs.RO · 2026-05-08 · unverdicted · none · ref 75
Anchor-Centric Adaptation escapes the diversity trap by prioritizing repeated demonstrations at core anchors over broad coverage, yielding higher success rates under fixed data budgets in robotic manipulation.
RoboStereo: Dual-Tower 4D Embodied World Models for Unified Policy Optimization cs.CV · 2026-03-13 · unverdicted · none · ref 52
A dual-tower 4D embodied world model called RoboStereo reduces geometric hallucinations and delivers over 97% relative improvement on manipulation tasks via test-time augmentation, imitative learning, and open exploration.
RISE: Self-Improving Robot Policy with Compositional World Model cs.RO · 2026-02-11 · unverdicted · none · ref 100
RISE combines a controllable dynamics model and progress value model into a closed-loop self-improving pipeline that updates robot policies entirely in imagination, reporting over 35% absolute gains on three real-world tasks.
Towards Long-Lived Robots: Continual Learning VLA Models via Reinforcement Fine-Tuning cs.RO · 2026-02-11 · unverdicted · none · ref 81
LifeLong-RFT applies chunking-level on-policy reinforcement learning with Quantized Action Consistency Reward, Continuous Trajectory Alignment Reward, and Format Compliance Reward to fine-tune VLA models, achieving a 22% average success rate gain over supervised fine-tuning on the LIBERO benchmark's
WorldArena 2.0: Extending Embodied World Model Benchmarking on Modality, Functionality and Platform cs.RO · 2026-05-18 · unverdicted · none · ref 41
WorldArena 2.0 extends embodied world model benchmarks to visuotactile perception, interactive policy training, and diverse real and simulated robotic platforms under a unified protocol.
Sword: Style-Robust World Models as Simulators via Dynamic Latent Bootstrapping for VLA Policy Post-Training cs.CV · 2026-05-08 · unverdicted · none · ref 11
Sword improves world model simulators for VLA policies by disentangling visual style from dynamics and bootstrapping latents for better consistency, outperforming baselines on LIBERO in generalization and RL post-training success.
Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models cs.CV · 2026-05-07 · unverdicted · none · ref 82
Semantic latent spaces from pretrained encoders outperform reconstruction-based spaces for robotic world models on planning and downstream policy performance.
STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation cs.RO · 2026-04-29 · unverdicted · none · ref 43
STARRY uses unified diffusion to align spatial-temporal world predictions with action generation plus GASAM for geometry-aware attention, reaching 93.82%/93.30% success on 50 bimanual tasks in simulation and raising real-world success from 42.5% to 70.8%.
World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems cs.RO · 2026-04-16 · unverdicted · none · ref 33
The World-Value-Action model enables implicit planning for VLA systems by performing inference over a learned latent representation of high-value future trajectories instead of direct action prediction.
Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts cs.CV · 2026-05-21 · unverdicted · none · ref 29
Pre-VLA is a multimodal runtime verifier that predicts safety confidence and advantage scores for action chunks, raising closed-loop success rates on the LIBERO benchmark from 30.79% to 37.62%.
World Action Models: The Next Frontier in Embodied AI cs.RO · 2026-05-12 · unverdicted · none · ref 52
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.

Wmpo: World model-based policy optimization for vision-language-action models

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer