Gradient-based planning with world models. arXiv:2312.17227
6 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 6
roles: background 1
polarities: background 1
citing papers explorer
- Learning Visual Feature-Based World Models via Residual Latent Action
  RLA-WM predicts residual latent actions via flow matching to create visual feature world models that outperform prior feature-based and diffusion approaches while enabling offline video-based robot RL.
- Gradient-Based Join Ordering
  Relaxing join orders to a differentiable soft adjacency matrix and optimizing with gradients plus a GNN cost model yields plans that match or beat discrete search while scaling better on graph datasets.
- Slot-MPC: Goal-Conditioned Model Predictive Control with Object-Centric Representations
  Slot-MPC learns slot representations to build a differentiable object-centric dynamics model that supports efficient gradient-based MPC for robotic manipulation in novel situations.
- Dream-MPC: Gradient-Based Model Predictive Control with Latent Imagination
  Dream-MPC refines policy-generated trajectories by gradient ascent in a latent world model with uncertainty regularization and temporal amortization, improving base policy performance and beating gradient-free MPC on 24 continuous control tasks.
- Scaling World-Model Reinforcement Learning Through Diffusion Policy Optimization
  MBDPO reformulates policy optimization as a diffusion process over searched trajectories in latent world models to reduce misalignment between search and value learning.
- Emotion-Conditioned Short-Horizon Human Pose Forecasting with a Lightweight Predictive World Model
  Facial emotion embeddings improve short-term pose forecasting accuracy for emotion-driven motions when fused via normalized gating in a lightweight LSTM world model, but not with simple multimodal fusion.