hub

Transformers are sample-efficient world models

Vincent Micheli, Eloi Alonso, François Fleuret · 2022 · arXiv 2209.00588

16 Pith papers cite this work. Polarity classification is still indexing.

16 Pith papers citing it

read on arXiv browse 16 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

cs.CL · 2023-09-28 · unverdicted · novelty 8.0

Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.

JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

JEDI is the first online end-to-end latent diffusion world model that trains latents from denoising loss rather than reconstruction, achieving competitive Atari100k results with 43% less VRAM and over 3x faster sampling than pixel diffusion baselines.

Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement

cs.CV · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

NOVA represents world states as INR weights for decoder-free rendering, compactness, and unsupervised disentanglement of background, foreground, and motion in video world models.

Latent State Design for World Models under Sufficiency Constraints

cs.AI · 2026-05-03 · unverdicted · novelty 7.0

World models succeed when their latent states are built to meet task-specific sufficiency constraints rather than preserving the maximum amount of information.

Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations

cs.RO · 2026-04-27 · unverdicted · novelty 7.0 · 2 refs

ACO-MoE recovers 95.3% of clean-input performance in visual control tasks under Markov-switching corruptions by routing restoration experts and anchoring representations to clean foreground masks.

Advantage-Guided Diffusion for Model-Based Reinforcement Learning

cs.AI · 2026-04-10 · unverdicted · novelty 7.0

Advantage-guided diffusion (SAG and EAG) steers sampling in diffusion world models to higher-advantage trajectories, enabling policy improvement and better sample efficiency on MuJoCo tasks.

Training Agents Inside of Scalable World Models

cs.AI · 2025-09-29 · conditional · novelty 7.0

Dreamer 4 is the first agent to obtain diamonds in Minecraft from only offline data by reinforcement learning inside a scalable world model that accurately predicts game mechanics.

Massive Activations in Large Language Models

cs.CL · 2024-02-27 · unverdicted · novelty 7.0

Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.

Learning Interactive Real-World Simulators

cs.AI · 2023-10-09 · conditional · novelty 7.0

UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.

Mastering Diverse Domains through World Models

cs.AI · 2023-01-10 · unverdicted · novelty 7.0

DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.

OFlow: Injecting Object-Aware Temporal Flow Matching for Robust Robotic Manipulation

cs.RO · 2026-04-20 · unverdicted · novelty 6.0

OFlow unifies temporal foresight and object-aware reasoning inside a shared latent space via flow matching to improve VLA robustness in robotic manipulation under distribution shifts.

Model-Based Reinforcement Learning under Random Observation Delays

cs.LG · 2025-09-25 · unverdicted · novelty 6.0

A delay-aware model-based RL framework with sequential belief filtering handles random out-of-sequence observations in POMDPs and outperforms MDP baselines while showing robustness to delay shifts.

DAWM: Diffusion Action World Models for Offline Reinforcement Learning via Action-Inferred Transitions

cs.LG · 2025-09-23 · unverdicted · novelty 6.0

DAWM introduces a modular diffusion world model with an inverse dynamics model to produce complete synthetic transitions that improve conservative offline RL algorithms like TD3BC and IQL on D4RL tasks.

Physically Interpretable World Models via Weakly Supervised Representation Learning

cs.LG · 2024-12-17 · unverdicted · novelty 6.0

PIWM aligns latent states in image-based world models with physical variables and constrains their dynamics to known equations via weak distribution supervision, yielding accurate long-horizon predictions and parameter recovery on Cart Pole, Lunar Lander, and Donkey Car.

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

cs.RO · 2024-11-07 · unverdicted · novelty 6.0

DINO-WM builds world models on pre-trained DINOv2 features to enable zero-shot planning from offline data without rewards or demonstrations.

Self-supervised Hierarchical Visual Reasoning with World Model

cs.AI · 2026-05-17

citing papers explorer

Showing 16 of 16 citing papers.

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution cs.CL · 2023-09-28 · unverdicted · none · ref 231
Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.
JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning cs.LG · 2026-05-13 · unverdicted · none · ref 46
JEDI is the first online end-to-end latent diffusion world model that trains latents from denoising loss rather than reconstruction, achieving competitive Atari100k results with 43% less VRAM and over 3x faster sampling than pixel diffusion baselines.
Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement cs.CV · 2026-05-07 · unverdicted · none · ref 17 · 2 links
NOVA represents world states as INR weights for decoder-free rendering, compactness, and unsupervised disentanglement of background, foreground, and motion in video world models.
Latent State Design for World Models under Sufficiency Constraints cs.AI · 2026-05-03 · unverdicted · none · ref 46
World models succeed when their latent states are built to meet task-specific sufficiency constraints rather than preserving the maximum amount of information.
Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations cs.RO · 2026-04-27 · unverdicted · none · ref 40 · 2 links
ACO-MoE recovers 95.3% of clean-input performance in visual control tasks under Markov-switching corruptions by routing restoration experts and anchoring representations to clean foreground masks.
Advantage-Guided Diffusion for Model-Based Reinforcement Learning cs.AI · 2026-04-10 · unverdicted · none · ref 7
Advantage-guided diffusion (SAG and EAG) steers sampling in diffusion world models to higher-advantage trajectories, enabling policy improvement and better sample efficiency on MuJoCo tasks.
Training Agents Inside of Scalable World Models cs.AI · 2025-09-29 · conditional · none · ref 67
Dreamer 4 is the first agent to obtain diamonds in Minecraft from only offline data by reinforcement learning inside a scalable world model that accurately predicts game mechanics.
Massive Activations in Large Language Models cs.CL · 2024-02-27 · unverdicted · none · ref 61
Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.
Learning Interactive Real-World Simulators cs.AI · 2023-10-09 · conditional · none · ref 252
UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.
Mastering Diverse Domains through World Models cs.AI · 2023-01-10 · unverdicted · none · ref 45
DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.
OFlow: Injecting Object-Aware Temporal Flow Matching for Robust Robotic Manipulation cs.RO · 2026-04-20 · unverdicted · none · ref 46
OFlow unifies temporal foresight and object-aware reasoning inside a shared latent space via flow matching to improve VLA robustness in robotic manipulation under distribution shifts.
Model-Based Reinforcement Learning under Random Observation Delays cs.LG · 2025-09-25 · unverdicted · none · ref 29
A delay-aware model-based RL framework with sequential belief filtering handles random out-of-sequence observations in POMDPs and outperforms MDP baselines while showing robustness to delay shifts.
DAWM: Diffusion Action World Models for Offline Reinforcement Learning via Action-Inferred Transitions cs.LG · 2025-09-23 · unverdicted · none · ref 20
DAWM introduces a modular diffusion world model with an inverse dynamics model to produce complete synthetic transitions that improve conservative offline RL algorithms like TD3BC and IQL on D4RL tasks.
Physically Interpretable World Models via Weakly Supervised Representation Learning cs.LG · 2024-12-17 · unverdicted · none · ref 40
PIWM aligns latent states in image-based world models with physical variables and constrains their dynamics to known equations via weak distribution supervision, yielding accurate long-horizon predictions and parameter recovery on Cart Pole, Lunar Lander, and Donkey Car.
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning cs.RO · 2024-11-07 · unverdicted · none · ref 39
DINO-WM builds world models on pre-trained DINOv2 features to enable zero-shot planning from offline data without rewards or demonstrations.
Self-supervised Hierarchical Visual Reasoning with World Model cs.AI · 2026-05-17 · unreviewed · ref 7

Transformers are sample-efficient world models

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer