Introduces HOI-Edit benchmark with HOI-Eval metric and SCPE self-correcting framework leveraging I2V models for competitive HOI image editing performance.
arXiv preprint arXiv:2308.07234 (2023)
5 Pith papers cite this work.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
citing papers explorer
- Taming I2V models for Image HOI Editing: A Cognitive Benchmark and Agentic Self-Correcting Framework
  Introduces HOI-Edit benchmark with HOI-Eval metric and SCPE self-correcting framework leveraging I2V models for competitive HOI image editing performance.
- Physically Interpretable World Models via Weakly Supervised Representation Learning
  PIWM aligns latent states in image-based world models with physical variables and constrains their dynamics to known equations via weak distribution supervision, yielding accurate long-horizon predictions and parameter recovery on Cart Pole, Lunar Lander, and Donkey Car.
- SparseWorld: Enhancing End-to-End Autonomous Driving via World Models with Sparse Scene Representation
  SparseWorld is a sparse world model with a Sparse Dreamer module that performs autoregressive rollout of future instances to refine motion prediction and planning, reporting 0.05% collision rate on nuScenes open-loop metrics.
- World Models: A Comprehensive Survey of Architectures, Methodologies, Reasoning Paradigms, and Applications
  The paper delivers a multi-axis taxonomy for world models that maps architectures, training families, reasoning strategies, and domains from early cognitive foundations through systems such as Dreamer, MuZero, and Sora while noting evaluation gaps.
- ExploreVLA: Dense World Modeling and Exploration for End-to-End Autonomous Driving