Fast LeWorldModel

Xiangyu Xu; Yuntian Gao

arXiv: 2606.26217 · v1 · pith:SCUVSRZ3new · submitted 2026-06-24 · 💻 cs.LG · cs.CV · cs.RO

Pith reviewed 2026-06-26 01:51 UTC · model grok-4.3

classification 💻 cs.LG · cs.CV · cs.RO

keywords Fast-LeWM · action-prefix prediction · latent world model · visual planning · JEPA · autoregressive rollout · multi-horizon prediction · open-loop latent loss

The pith

Fast-LeWM replaces one-step latent rollouts with parallel action-prefix predictions for faster visual planning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

LeWM performs visual planning by repeatedly applying a one-step latent transition model to candidate action sequences, which incurs high compute cost and lets errors accumulate as the horizon lengthens. Fast-LeWM instead encodes successive prefixes of a candidate action sequence and predicts the corresponding future latents directly and in parallel. This shifts the prediction unit from single transitions to accumulated state changes under action prefixes of different lengths. The authors report that the change lowers open-loop latent loss, markedly slows the growth of that loss with horizon, raises average task success, and cuts planning time.

Core claim

Fast-LeWM replaces repeated application of a local one-step latent transition model with action-prefix prediction. Given the current latent and a candidate action sequence, the model encodes the sequence prefixes and predicts the future latents reached after executing those prefixes. Planning then uses the final prefix token to obtain the target latent without iterating through every intermediate imagined state. This yields lower open-loop latent loss that grows significantly more slowly as the rollout horizon increases.
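The contrast between the two prediction schemes can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's architecture: the additive dynamics z' = z + a, the running-sum prefix encoder, and the names one_step, rollout, encode_prefixes and predict_from_prefixes are all hypothetical. In this toy setting the two schemes agree exactly, which isolates the structural difference: the rollout feeds each prediction into the next call, while the prefix predictor maps every prefix token to its latent independently.

```python
from itertools import accumulate

Vec = list[float]

def add(u: Vec, v: Vec) -> Vec:
    return [x + y for x, y in zip(u, v)]

def one_step(z: Vec, a: Vec) -> Vec:
    """Toy one-step latent transition z' = z + a (an assumption, not the paper's model)."""
    return add(z, a)

def rollout(z0: Vec, actions: list[Vec]) -> list[Vec]:
    """Autoregressive baseline: H sequential calls, each fed the previous prediction."""
    latents, z = [], z0
    for a in actions:
        z = one_step(z, a)
        latents.append(z)
    return latents

def encode_prefixes(actions: list[Vec]) -> list[Vec]:
    """Toy prefix encoder: token k summarises a_1..a_k (here, a running sum)."""
    return list(accumulate(actions, add))

def predict_from_prefixes(z0: Vec, tokens: list[Vec]) -> list[Vec]:
    """Each token maps to its future latent independently of the others,
    so this loop could be executed as a single batched call."""
    return [add(z0, t) for t in tokens]

z0 = [0.0, 1.0]
actions = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]

sequential = rollout(z0, actions)                # three dependent calls
tokens = encode_prefixes(actions)
parallel = predict_from_prefixes(z0, tokens)     # three independent predictions
final_latent = predict_from_prefixes(z0, tokens[-1:])[0]  # all that planning needs
```

With a learned predictor the two outputs would generally differ; whether that difference helps or hurts is the empirical question the rest of this page returns to.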

What carries the argument

Action-prefix prediction, which encodes prefixes of an action sequence and maps each prefix to its corresponding future latent in parallel.

If this is right

Planning time decreases substantially relative to autoregressive one-step rollout.
Average success rate rises across the tested tasks.
Open-loop latent loss remains lower and its increase with horizon length becomes slower.
The model learns continuous state evolution under action prefixes of varying lengths rather than isolated one-step transitions.
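The third consequence above turns on how open-loop latent loss is measured per horizon. A minimal sketch of one way to compute it follows, assuming predictions and targets are stored as nested lists indexed by trajectory, horizon and latent dimension; the function names and the crude slope are hypothetical, and the numbers are placeholders chosen for easy arithmetic, not values from the paper.

```python
def per_horizon_mse(pred, target):
    """pred and target are indexed [trajectory][horizon][latent_dim].
    Returns one mean squared error per horizon step."""
    n_traj, horizon = len(pred), len(pred[0])
    losses = []
    for h in range(horizon):
        sq = [
            (p - t) ** 2
            for i in range(n_traj)
            for p, t in zip(pred[i][h], target[i][h])
        ]
        losses.append(sum(sq) / len(sq))
    return losses

def growth_per_step(losses):
    """Average increase in loss per extra horizon step (a crude slope)."""
    return (losses[-1] - losses[0]) / (len(losses) - 1)

# Placeholder data: one trajectory, three horizons, one latent dimension.
target = [[[0.0], [0.0], [0.0]]]
pred = [[[1.0], [2.0], [3.0]]]

losses = per_horizon_mse(pred, target)   # [1.0, 4.0, 9.0]
slope = growth_per_step(losses)          # 4.0
```

Comparing this slope for the one-step rollout and for the prefix predictor on the same held-out trajectories is one way the "slower growth" claim could be checked once numbers are available.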

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The prefix approach could support planning at horizons that remain computationally prohibitive under sequential rollout.
Direct prefix supervision may transfer to other JEPA-style latent models that currently use autoregressive dynamics.
Avoiding intermediate imagined states could reduce compounding errors in environments where action effects accumulate nonlinearly.
Parallel prefix evaluation might scale more readily to larger action spaces or higher-dimensional observations.

Load-bearing premise

Predicting latents directly from encoded action prefixes produces accurate multi-horizon forecasts without introducing new systematic biases that appear only during closed-loop execution.

What would settle it

A drop in closed-loop task success or a mismatch between open-loop latent loss and actual execution error when using the prefix predictor would falsify the claim of improved multi-horizon accuracy.
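The falsification test above amounts to looking for tasks where the open-loop metric and the closed-loop outcome move in opposite directions. A minimal sketch, assuming per-task results are available for both models; the TaskResult fields and the function name are hypothetical, and the values are placeholders, not results from the paper.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    task: str
    loss_baseline: float     # open-loop latent loss, one-step rollout
    loss_prefix: float       # open-loop latent loss, prefix predictor
    success_baseline: float  # closed-loop success rate in [0, 1]
    success_prefix: float

def mismatched_tasks(results):
    """Tasks where open-loop loss improved but closed-loop success got worse."""
    return [
        r.task
        for r in results
        if r.loss_prefix < r.loss_baseline and r.success_prefix < r.success_baseline
    ]

# Placeholder values only; they are not taken from the paper.
results = [
    TaskResult("task_a", 0.30, 0.10, 0.50, 0.70),
    TaskResult("task_b", 0.30, 0.10, 0.60, 0.40),
]
flagged = mismatched_tasks(results)  # ["task_b"]
```

A non-empty list would not by itself refute the paper, but each flagged task is a place where open-loop loss fails as a proxy for planning quality.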

Original abstract

Joint-Embedding Predictive Architectures (JEPAs), including recent LeWorldModel (LeWM), have become a promising foundation for reconstruction-free visual world models. For visual planning, however, LeWM evaluates candidate action sequences by repeatedly applying a local one-step latent transition model. This autoregressive rollout makes planning computationally expensive and exposes the predicted trajectory to accumulated latent errors as the horizon grows. We propose Fast LeWorldModel (Fast-LeWM), a fast latent world model that replaces repeated local rollout with action-prefix prediction. Given the current latent and a candidate action sequence, Fast-LeWM encodes its prefixes and predicts the future latents reached after executing those prefixes in parallel. By making action prefixes the basic prediction unit, Fast-LeWM directly models action effects accumulated to different extents over multiple horizons. This prefix-level supervision forces the model to learn how states continuously evolve under different action prefixes, rather than only fitting one-step state transitions. During planning, the predictor can use the last prefix token from the encoded action sequence to evaluate the corresponding future latent without explicitly rolling through each intermediate imagined state. Across multiple tasks, Fast-LeWM improves average success over LeWM while substantially reducing planning time, achieving lower open-loop latent loss whose growth becomes significantly slower as the rollout horizon increases.
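The planning step described in the abstract, scoring a candidate by the latent at its last prefix token, can be sketched with a sampling planner. The abstract does not say which planner is used, so plain random shooting is swapped in here as the simplest option; the toy additive predictor, the squared-distance cost, and the names predict_final_latent, cost and plan are assumptions for illustration only.

```python
import random

def predict_final_latent(z0, actions):
    """Hypothetical stand-in for the prefix predictor read out at the last
    prefix token. Toy additive dynamics: z_H = z0 + sum of all actions."""
    return [z + sum(a[d] for a in actions) for d, z in enumerate(z0)]

def cost(z, goal):
    """Squared distance between a predicted latent and the goal latent."""
    return sum((x - g) ** 2 for x, g in zip(z, goal))

def plan(z0, goal, horizon=4, n_candidates=256, seed=0):
    """Random shooting: sample action sequences, keep the cheapest one."""
    rng = random.Random(seed)
    dim = len(z0)
    best_actions, best_cost = None, float("inf")
    for _ in range(n_candidates):
        cand = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(horizon)]
        # One predictor call per candidate, with no intermediate imagined states.
        c = cost(predict_final_latent(z0, cand), goal)
        if c < best_cost:
            best_actions, best_cost = cand, c
    return best_actions, best_cost

actions, c = plan([0.0, 0.0], [1.0, -1.0])
```

Under an autoregressive model the marked line would instead require `horizon` dependent calls per candidate, which is where the claimed planning-time saving would come from.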

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Fast-LeWM swaps autoregressive one-step rollouts for parallel prefix-to-latent predictions, but the abstract supplies no numbers or controls, so the performance claims cannot be checked.

The core change is replacing repeated local latent transitions with direct prediction from encoded action prefixes: each prefix token maps to the latent at its own horizon, and the final prefix token gives the latent at the end of the candidate sequence. This is presented as new relative to the LeWM baseline and targets the obvious cost of long-horizon planning plus the usual compounding-error problem.

The approach is straightforward and the motivation is practical: if you can supervise prefixes in parallel you avoid both the sequential compute and the need to chain one-step predictions. That part of the design is clear and directly addresses a known bottleneck in reconstruction-free visual world models.

The problem is that the abstract states improved success rates, lower open-loop loss, and slower loss growth with horizon, yet gives no tables, no baseline numbers, no ablation on the prefix encoding, and no closed-loop versus open-loop comparison. Without those, the central claim cannot be evaluated. The stress-test concern about possible new biases from independent prefix mappings is also left open; nothing in the abstract shows that the parallel predictions stay consistent with what an autoregressive rollout would have produced at intermediate steps.

This is the kind of incremental architecture tweak that might matter for people already running LeWM-style planners in robotics or model-based RL, but only if the experiments actually hold up. Right now the write-up reads like a promising sketch rather than a completed result.

I would bring it to a reading group only after the full paper is available with the numbers. I would not cite it on the basis of the abstract alone. A serious editor could send it to review if the manuscript contains proper controls and error analysis, but it would need heavy revision on the empirical side.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Fast-LeWM as an extension to LeWorldModel (LeWM) for visual planning. Instead of repeated one-step autoregressive latent transitions, it encodes action-sequence prefixes and directly predicts the resulting future latents in parallel, one per prefix; during planning, the final prefix token gives the latent reached at the end of the candidate sequence. The central claims are that this yields higher average task success, substantially lower planning time, and lower open-loop latent loss whose growth slows with increasing rollout horizon.

Significance. If the reported gains are reproducible and the open-loop improvements translate to closed-loop planning without new biases, the prefix-based formulation would address a key scalability limitation of autoregressive JEPA-style world models, enabling more efficient multi-horizon planning in visual domains.

major comments (2)

[Abstract] Abstract: the performance claims (improved success, reduced planning time, slower loss growth) are stated without any numerical values, number of tasks, baselines, standard errors, or ablation controls, so the magnitude and reliability of the central empirical result cannot be assessed from the provided text.
[Method] Method description of prefix prediction: no verification is supplied that the direct prefix-to-latent mapping produces the same intermediate latents that would arise from sequential application of the original one-step transition model; without this, it remains possible that independent prefix mappings introduce systematic discrepancies that only surface when the predicted latents are used to select actions in closed loop.
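The verification requested in the second major comment could take roughly the following shape: compare the prefix predictor's latent at each horizon with the latent from sequential one-step rollout, and report where they first diverge. This is a sketch under assumptions; the function names, the tolerance, and the example latents are hypothetical placeholders, not outputs of either model.

```python
def max_abs_gap_per_horizon(prefix_latents, rollout_latents):
    """Largest absolute coordinate gap between the two predictions at each horizon."""
    return [
        max(abs(p - r) for p, r in zip(zp, zr))
        for zp, zr in zip(prefix_latents, rollout_latents)
    ]

def first_divergent_horizon(gaps, tol=1e-3):
    """1-based horizon at which the gap first exceeds tol, or None if it never does."""
    for h, g in enumerate(gaps, start=1):
        if g > tol:
            return h
    return None

# Placeholder latents for one action sequence, three horizons, two dimensions.
prefix_latents = [[1.0, 1.0], [1.0, 3.0], [0.5, 4.0]]
rollout_latents = [[1.0, 1.0], [1.0, 3.0], [0.0, 4.0]]

gaps = max_abs_gap_per_horizon(prefix_latents, rollout_latents)  # [0.0, 0.0, 0.5]
diverges_at = first_divergent_horizon(gaps)                      # 3
```

A gap measures disagreement only. Since the rollout itself drifts with horizon, each gap would need to be read against ground-truth encoded latents to tell which predictor is closer.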

minor comments (1)

The abstract would be strengthened by including at least one quantitative result (e.g., success-rate delta or planning-time ratio) to support the stated improvements.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to improve clarity and add requested verifications.

Referee: [Abstract] Abstract: the performance claims (improved success, reduced planning time, slower loss growth) are stated without any numerical values, number of tasks, baselines, standard errors, or ablation controls, so the magnitude and reliability of the central empirical result cannot be assessed from the provided text.

Authors: We agree that the abstract lacks quantitative detail. In the revision we will insert specific numbers for average success improvement, planning-time reduction, number of tasks, and any reported standard errors or ablation controls so readers can immediately assess the scale and reliability of the results.

Revision: yes

Referee: [Method] Method description of prefix prediction: no verification is supplied that the direct prefix-to-latent mapping produces the same intermediate latents that would arise from sequential application of the original one-step transition model; without this, it remains possible that independent prefix mappings introduce systematic discrepancies that only surface when the predicted latents are used to select actions in closed loop.

Authors: The concern is valid. The current manuscript does not contain an explicit side-by-side verification that direct prefix predictions match the intermediate latents obtained by sequential one-step rollout. We will add this verification (open-loop latent comparison across horizons) and discuss any observed discrepancies in the context of closed-loop action selection.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; improvements are empirical claims from architectural change, not reductions to fitted inputs or self-citations

The paper's central proposal is an architectural replacement of autoregressive one-step rollout with parallel prefix-based latent prediction. No equations, fitted parameters, or self-citations are presented that would make the reported lower open-loop loss or higher task success reduce by construction to a redefinition of prior quantities. The prefix supervision and planning speedup are design choices whose performance is evaluated empirically against LeWM; the derivation chain contains no self-definitional steps, fitted-input predictions, or load-bearing self-citations. This matches the default expectation for non-circular papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are described in the abstract.

