NeuROK: Generative 4D Neural Object Kinematics
Pith reviewed 2026-06-29 08:18 UTC · model grok-4.3
The pith
A learned latent space for all object states lets dynamics be simulated using Lagrangian mechanics only in that low-dimensional space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that learning both a latent space representing all possible states of the object and a decoder that maps any sampled latent to a plausibly deformed shape of the object significantly simplifies the generation of simulative dynamics, since only the dynamics within this low-dimensional latent space need to be considered from the Lagrangian mechanics perspective in classical physics. The resulting transformer-based model demonstrates effectiveness and generality across diverse dynamic object types.
What carries the argument
Neural Object Kinematics (NeuROK): the learned latent space of object states together with its decoder to deformed shapes, which carries the argument by moving all physics simulation into that latent space.
If this is right
- Dynamics simulation reduces to operating only inside the low-dimensional latent space rather than the full 3D geometry.
- The same trained model applies to many different dynamic object categories without requiring new physical models or parameter fitting.
- Realistic temporal deformations arise under varied physical conditions directly from the learned kinematic parameterization.
- The framework produces clear improvements over earlier methods that rely on predefined physical models and system identification.
Where Pith is reading between the lines
- If the latent space is complete, new dynamic sequences could be produced simply by choosing different initial latent states and integrating the latent equations forward.
- The approach could be attached to existing static 3D generative models to supply the missing time dimension without retraining the geometry generator.
- Efficiency gains would appear if the latent dynamics themselves can be learned or approximated faster than full 3D physics solvers.
Load-bearing premise
A data-driven latent space learned from a curated large-scale 4D dataset can represent every possible state of an object and its decoder can map any latent sample to a plausibly deformed shape.
What would settle it
Apply the trained latent dynamics to an object type absent from the training set and check whether the resulting 4D deformation sequences match independent physical observations or ground-truth simulations.
Figures
read the original abstract
Data-driven approaches have revolutionized 3D vision, enabling transformers to effectively reconstruct and generate static 3D objects. However, generating simulative 4D dynamics -- realistic temporal deformations of static objects under various physical conditions -- remains challenging and often ad hoc, despite its importance in building comprehensive 3D world models. Most existing methods assume a predefined physical model and use system identification to estimate parameters, restricting these methods to specific categories and small-scale datasets. We propose that these restrictions can be overcome by learning a data-driven kinematic state parameterization for object-centric physical systems. Specifically, we learn both a latent space representing all possible states of the object and a decoder that maps any sampled latent to a plausibly deformed shape of the object. We refer to this parameterization as Neural Object Kinematics (NeuROK), and learn a transformer-based encoder-decoder model on a curated large-scale 4D dataset. This formulation and the learned model significantly simplify the generation of simulative dynamics since we only need to consider the dynamics within a low-dimensional latent space from the Lagrangian mechanics' perspective in classical physics. We demonstrate the effectiveness and generality of this neural simulation framework across diverse dynamic object types, showing clear advantages over prior works. Project page: https://chen-geng.com/neurok
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces NeuROK, a transformer-based encoder-decoder that learns a low-dimensional latent space representing all possible states of an object together with a decoder mapping latents to deformed shapes. It claims that simulating dynamics via Lagrangian mechanics directly in this latent space (without predefined physical models or system identification) simplifies 4D generation and yields clear advantages over prior category-specific methods, with demonstrations across diverse dynamic object types on a curated large-scale 4D dataset.
Significance. If the central claim holds—that a purely data-driven latent parameterization plus Lagrangian simulation produces realistic, physically consistent temporal deformations—the approach could materially advance category-agnostic 4D world models by removing the need for hand-specified physics per object class.
major comments (3)
- [Abstract] Abstract: the assertion that the formulation 'significantly simplify[s] the generation of simulative dynamics' and shows 'clear advantages over prior works' is unsupported by any quantitative metrics, ablation studies, or comparisons; the central claim therefore cannot be evaluated from the supplied text.
- [Abstract] Abstract / implied method: the paper states that dynamics are considered 'from the Lagrangian mechanics' perspective in classical physics' yet supplies no description of how the kinetic and potential energy terms are instantiated or learned inside the latent space, nor whether any conservation laws or boundary conditions are enforced during roll-out; this is load-bearing for the physical-consistency claim.
- [Abstract] Abstract: the weakest modeling assumption—that any trajectory sampled in the learned latent space corresponds to a physically valid shape sequence—is stated without empirical test or constraint, leaving open whether the decoder can produce implausible deformations under simulated dynamics.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below with honest responses and indicate where revisions will be made to the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the assertion that the formulation 'significantly simplify[s] the generation of simulative dynamics' and shows 'clear advantages over prior works' is unsupported by any quantitative metrics, ablation studies, or comparisons; the central claim therefore cannot be evaluated from the supplied text.
Authors: We agree the abstract overstates the claims without direct support. The full paper contains quantitative comparisons and ablations in the experiments section, but these are not referenced in the abstract. We will revise the abstract to remove the unsubstantiated assertions about simplification and advantages, replacing them with a neutral description of the approach and directing readers to the results. revision_made = 'yes' revision: yes
-
Referee: [Abstract] Abstract / implied method: the paper states that dynamics are considered 'from the Lagrangian mechanics' perspective in classical physics' yet supplies no description of how the kinetic and potential energy terms are instantiated or learned inside the latent space, nor whether any conservation laws or boundary conditions are enforced during roll-out; this is load-bearing for the physical-consistency claim.
Authors: The method section details the latent parameterization and Lagrangian simulation, but we acknowledge the abstract and high-level description lack explicit equations or implementation details for the energy terms. We will expand the method section with a dedicated subsection describing how kinetic and potential energies are defined and optimized in latent space, along with any conservation enforcement during rollout. revision_made = 'yes' revision: yes
-
Referee: [Abstract] Abstract: the weakest modeling assumption—that any trajectory sampled in the learned latent space corresponds to a physically valid shape sequence—is stated without empirical test or constraint, leaving open whether the decoder can produce implausible deformations under simulated dynamics.
Authors: This concern is valid; the current experiments rely on data-driven training to promote plausibility but do not include explicit tests of physical validity for arbitrary latent trajectories. We will add a new analysis subsection with qualitative and quantitative checks on decoder outputs under simulated dynamics, plus discussion of limitations. revision_made = 'partial' revision: partial
Circularity Check
No significant circularity: data-driven latent parameterization trained on external 4D dataset
full rationale
The paper learns a latent space and decoder from a curated large-scale 4D dataset to represent object states, then simulates dynamics in that latent space using the Lagrangian perspective. No equations, self-citations, or fitted parameters are shown reducing the central claim to its own inputs by construction. The approach is explicitly data-driven rather than self-referential, with the Lagrangian application serving as an interpretive framework on independently learned components. This matches the default expectation of non-circularity for data-driven methods without load-bearing self-references or definitional loops.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Lagrangian mechanics perspective can be applied to dynamics within the learned low-dimensional latent space to generate realistic deformations
Reference graph
Works this paper leans on
-
[1]
Fast and deep facial deformations.ACM Transactions on Graphics (TOG), 39(4):94–1, 2020
Stephen W Bailey, Dalton Omens, Paul Dilorenzo, and James F O’Brien. Fast and deep facial deformations.ACM Transactions on Graphics (TOG), 39(4):94–1, 2020. 3
2020
-
[2]
Learning data-driven discretizations for partial differential equations.Proceedings of the National Academy of Sciences, 116(31):15344–15349, 2019
Yohai Bar-Sinai, Stephan Hoyer, Jason Hickey, and Michael P Brenner. Learning data-driven discretizations for partial differential equations.Proceedings of the National Academy of Sciences, 116(31):15344–15349, 2019. 3
2019
-
[3]
Interaction networks for learning about objects, relations and physics.Advances in neural information processing systems, 29, 2016
Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, et al. Interaction networks for learning about objects, relations and physics.Advances in neural information processing systems, 29, 2016. 3
2016
-
[4]
Simulation as an engine of physical scene under- standing.Proceedings of the national academy of sciences, 110(45):18327–18332, 2013
Peter W Battaglia, Jessica B Hamrick, and Joshua B Tenen- baum. Simulation as an engine of physical scene under- standing.Proceedings of the national academy of sciences, 110(45):18327–18332, 2013. 3
2013
-
[5]
A sur- vey of projection-based model reduction methods for para- metric dynamical systems.SIAM review, 57(4):483–531,
Peter Benner, Serkan Gugercin, and Karen Willcox. A sur- vey of projection-based model reduction methods for para- metric dynamical systems.SIAM review, 57(4):483–531,
-
[6]
Learn- ing articulated rigid body dynamics with lagrangian graph neural network.Advances in Neural Information Process- ing Systems, 35:29789–29800, 2022
Ravinder Bhattoo, Sayan Ranu, and NM Krishnan. Learn- ing articulated rigid body dynamics with lagrangian graph neural network.Advances in Neural Information Process- ing Systems, 35:29789–29800, 2022. 3
2022
-
[7]
Numerical methods for data science, 2018
David Bindel. Numerical methods for data science, 2018. 6
2018
-
[8]
Face recognition based on fitting a 3d morphable model.IEEE Transactions on pat- tern analysis and machine intelligence, 25(9):1063–1074,
V olker Blanz and Thomas Vetter. Face recognition based on fitting a 3d morphable model.IEEE Transactions on pat- tern analysis and machine intelligence, 25(9):1063–1074,
-
[9]
A 3d morphable model learnt from 10,000 faces
James Booth, Anastasios Roussos, Stefanos Zafeiriou, Al- lan Ponniah, and David Dunaway. A 3d morphable model learnt from 10,000 faces. InProceedings of the IEEE con- ference on computer vision and pattern recognition, pages 5543–5552, 2016. 3
2016
-
[10]
Projective dynamics: fusing con- straint projections for fast simulation.ACM Transactions on Graphics (TOG), 33(4):1–11, 2014
Sofien Bouaziz, Sebastian Martin, Tiantian Liu, Ladislav Kavan, and Mark Pauly. Projective dynamics: fusing con- straint projections for fast simulation.ACM Transactions on Graphics (TOG), 33(4):1–11, 2014. 2
2014
-
[11]
Neural defor- mation graphs for globally-consistent non-rigid reconstruc- tion
Aljaz Bozic, Pablo Palafox, Michael Zollhofer, Justus Thies, Angela Dai, and Matthias Nießner. Neural defor- mation graphs for globally-consistent non-rigid reconstruc- tion. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 1450–1459,
-
[12]
Gic: Gaussian-informed continuum for physical property iden- tification and simulation.Advances in Neural Information Processing Systems, 37:75035–75063, 2024
Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, and Qifeng Chen. Gic: Gaussian-informed continuum for physical property iden- tification and simulation.Advances in Neural Information Processing Systems, 37:75035–75063, 2024. 2
2024
-
[13]
A Compositional Object-Based Approach to Learning Physical Dynamics
Michael B Chang, Tomer Ullman, Antonio Torralba, and Joshua B Tenenbaum. A compositional object-based ap- proach to learning physical dynamics.arXiv preprint arXiv:1612.00341, 2016. 3
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[14]
Licrom: Linear-subspace continuous reduced order model- ing with neural fields
Yue Chang, Peter Yichen Chen, Zhecheng Wang, Maur- izio M Chiaramonte, Kevin Carlberg, and Eitan Grinspun. Licrom: Linear-subspace continuous reduced order model- ing with neural fields. InSIGGRAPH Asia 2023 Conference Papers, pages 1–12, 2023. 3
2023
-
[15]
Physgen3d: Crafting a miniature interactive world from a single image
Boyuan Chen, Hanxiao Jiang, Shaowei Liu, Saurabh Gupta, Yunzhu Li, Hao Zhao, and Shenlong Wang. Physgen3d: Crafting a miniature interactive world from a single image. InProceedings of the Computer Vision and Pattern Recog- nition Conference, pages 6178–6189, 2025. 2
2025
-
[16]
Vid2sim: Generalizable, video-based reconstruction of ap- pearance, geometry and physics for mesh-free simulation
Chuhao Chen, Zhiyang Dou, Chen Wang, Yiming Huang, Anjun Chen, Qiao Feng, Jiatao Gu, and Lingjie Liu. Vid2sim: Generalizable, video-based reconstruction of ap- pearance, geometry and physics for mesh-free simulation. InProceedings of the Computer Vision and Pattern Recog- nition Conference, pages 26545–26555, 2025. 2
2025
-
[17]
Freeart3d: Training-free articulated object generation using 3d diffusion
Chuhao Chen, Isabella Liu, Xinyue Wei, Hao Su, and Minghua Liu. Freeart3d: Training-free articulated object generation using 3d diffusion. InSIGGRAPH Asia 2025 Conference Papers, 2025. 6, 8
2025
-
[18]
Implicit neural spatial representa- tions for time-dependent pdes
Honglin Chen, Rundi Wu, Eitan Grinspun, Changxi Zheng, and Peter Yichen Chen. Implicit neural spatial representa- tions for time-dependent pdes. InInternational Conference on Machine Learning, pages 5162–5177. PMLR, 2023. 3
2023
-
[19]
Peter Yichen Chen, Jinxu Xiang, Dong Heon Cho, Yue Chang, GA Pershing, Henrique Teles Maia, Maurizio M Chiaramonte, Kevin Carlberg, and Eitan Grinspun. Crom: Continuous reduced-order modeling of pdes using implicit neural representations.arXiv preprint arXiv:2206.02607,
-
[20]
Model reduction for the material point method via an implicit neural representation of the deformation map.Journal of Computational Physics, 478: 111908, 2023
Peter Yichen Chen, Maurizio M Chiaramonte, Eitan Grin- spun, and Kevin Carlberg. Model reduction for the material point method via an implicit neural representation of the deformation map.Journal of Computational Physics, 478: 111908, 2023. 3
2023
-
[21]
Zoey Chen, Aaron Walsman, Marius Memmel, Kaichun Mo, Alex Fang, Karthikeya Vemuri, Alan Wu, Dieter Fox, and Abhishek Gupta. URDFormer: A pipeline for con- structing articulated simulation environments from real- world images.arXiv preprint arXiv:2405.11656, 2024. 2
-
[22]
Active sub- space methods in theory and practice: applications to krig- ing surfaces.SIAM Journal on Scientific Computing, 36(4): A1500–A1524, 2014
Paul G Constantine, Eric Dow, and Qiqi Wang. Active sub- space methods in theory and practice: applications to krig- ing surfaces.SIAM Journal on Scientific Computing, 36(4): A1500–A1524, 2014. 6 9
2014
-
[23]
Miles Cranmer, Sam Greydanus, Stephan Hoyer, Peter Battaglia, David Spergel, and Shirley Ho. Lagrangian neu- ral networks.arXiv preprint arXiv:2003.04630, 2020. 3
-
[24]
Rishit Dagli, Donglai Xiang, Vismay Modi, Charles Loop, Clement Fuji Tsang, Anka He Chen, Anita Hu, Gavriel State, David IW Levin, and Maria Shugrina. V omp: Predicting volumetric mechanical property fields.arXiv preprint arXiv:2510.22975, 2025. 2
-
[25]
Objaverse-xl: A universe of 10m+ 3d objects.Advances in Neural Information Processing Systems, 36:35799–35813,
Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Chris- tian Laforte, Vikram V oleti, Samir Yitzhak Gadre, et al. Objaverse-xl: A universe of 10m+ 3d objects.Advances in Neural Information Processing Systems, 36:35799–35813,
-
[26]
Diffpd: Differentiable projective dynamics.ACM Transactions on Graphics (ToG), 41(2):1–21, 2021
Tao Du, Kui Wu, Pingchuan Ma, Sebastien Wah, Andrew Spielberg, Daniela Rus, and Wojciech Matusik. Diffpd: Differentiable projective dynamics.ACM Transactions on Graphics (ToG), 41(2):1–21, 2021. 2
2021
-
[27]
Worldscore: A unified evaluation benchmark for world generation
Haoyi Duan, Hong-Xing Yu, Sirui Chen, Li Fei-Fei, and Ji- ajun Wu. Worldscore: A unified evaluation benchmark for world generation. InProceedings of the IEEE/CVF inter- national conference on computer vision, 2025. 8
2025
-
[28]
A point set generation network for 3d object reconstruction from a single image
Haoqiang Fan, Hao Su, and Leonidas J Guibas. A point set generation network for 3d object reconstruction from a single image. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 605–613,
-
[29]
Pie-nerf: Physics-based in- teractive elastodynamics with nerf
Yutao Feng, Yintong Shang, Xuan Li, Tianjia Shao, Chen- fanfu Jiang, and Yin Yang. Pie-nerf: Physics-based in- teractive elastodynamics with nerf. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4450–4461, 2024. 2
2024
-
[30]
Simplifying hamiltonian and lagrangian neural networks via explicit constraints.Advances in neural information processing systems, 33:13880–13889, 2020
Marc Finzi, Ke Alexander Wang, and Andrew G Wilson. Simplifying hamiltonian and lagrangian neural networks via explicit constraints.Advances in neural information processing systems, 33:13880–13889, 2020. 3
2020
-
[31]
Latent-space dynamics for re- duced deformable simulation
Lawson Fulton, Vismay Modi, David Duvenaud, David IW Levin, and Alec Jacobson. Latent-space dynamics for re- duced deformable simulation. InComputer graphics forum, pages 379–391. Wiley Online Library, 2019. 3
2019
-
[32]
Learning neural parametric head models
Simon Giebenhain, Tobias Kirschstein, Markos Geor- gopoulos, Martin R ¨unz, Lourdes Agapito, and Matthias Nießner. Learning neural parametric head models. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21003–21012, 2023. 3
2023
-
[33]
Pgc: Physics-based gaussian cloth from a single pose
Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban, Nikolaos Sarafianos, Hsiao-yu Chen, Oshri Halimi, Alja ˇz Boˇziˇc, Shunsuke Saito, Jiajun Wu, C Karen Liu, et al. Pgc: Physics-based gaussian cloth from a single pose. InPro- ceedings of the Computer Vision and Pattern Recognition Conference, pages 21215–21225, 2025. 2
2025
-
[34]
Category-agnostic neural object rigging
Guangzhao He, Chen Geng, Shangzhe Wu, and Jiajun Wu. Category-agnostic neural object rigging. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 22078–22088, 2025. 3, 6, 8
2025
-
[35]
Arapreg: An as- rigid-as possible regularization loss for learning deformable shape generators
Qixing Huang, Xiangru Huang, Bo Sun, Zaiwei Zhang, Junfeng Jiang, and Chandrajit Bajaj. Arapreg: An as- rigid-as possible regularization loss for learning deformable shape generators. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 5815–5825,
-
[36]
Vbench: Comprehensive benchmark suite for video generative models
Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, et al. Vbench: Comprehensive benchmark suite for video generative models. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21807–21818, 2024. 8
2024
-
[37]
Perceiver: General perception with iterative attention
Andrew Jaegle, Felix Gimeno, Andy Brock, Oriol Vinyals, Andrew Zisserman, and Joao Carreira. Perceiver: General perception with iterative attention. InInternational confer- ence on machine learning, pages 4651–4664. PMLR, 2021. 5
2021
-
[38]
Keypointde- former: Unsupervised 3d keypoint discovery for shape con- trol
Tomas Jakab, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, and Angjoo Kanazawa. Keypointde- former: Unsupervised 3d keypoint discovery for shape con- trol. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 12783–12792,
-
[39]
Dyrt: Dynamic response textures for real time deformation simulation with graphics hardware
Doug L James and Dinesh K Pai. Dyrt: Dynamic response textures for real time deformation simulation with graphics hardware. InProceedings of the 29th annual conference on Computer graphics and interactive techniques, pages 582– 585, 2002. 3
2002
-
[40]
Precom- puted acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources
Doug L James, Jernej Barbi ˇc, and Dinesh K Pai. Precom- puted acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources. ACM Transactions on Graphics (TOG), 25(3):987–995,
-
[41]
The material point method for simulating continuum materials
Chenfanfu Jiang, Craig Schroeder, Joseph Teran, Alexey Stomakhin, and Andrew Selle. The material point method for simulating continuum materials. InACM SIGGRAPH 2016 courses, pages 1–52, 2016. 2, 4
2016
-
[42]
Phystwin: Physics- informed reconstruction and simulation of deformable ob- jects from videos.ICCV, 2025
Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, and Yunzhu Li. Phystwin: Physics- informed reconstruction and simulation of deformable ob- jects from videos.ICCV, 2025. 2
2025
-
[43]
Ditto: Building digital twins of articulated objects from interac- tion
Zhenyu Jiang, Cheng-Chun Hsu, and Yuke Zhu. Ditto: Building digital twins of articulated objects from interac- tion. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 5616–5626,
-
[44]
Improving physics-augmented contin- uum neural radiance field-based geometry-agnostic system identification with lagrangian particle optimization
Takuhiro Kaneko. Improving physics-augmented contin- uum neural radiance field-based geometry-agnostic system identification with lagrangian particle optimization. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5480, 2024. 2
2024
-
[45]
Justin Kerr, Chung Min Kim, Mingxuan Wu, Brent Yi, Qianqian Wang, Ken Goldberg, and Angjoo Kanazawa. Robot see robot do: Imitating articulated object manipu- lation with monocular 4d reconstruction.arXiv preprint arXiv:2409.18121, 2024. 2
-
[46]
Auto-Encoding Variational Bayes
Diederik P Kingma and Max Welling. Auto-encoding vari- ational bayes.arXiv preprint arXiv:1312.6114, 2013. 5
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[47]
Mechanics
Lev Davidovich Landau and Evgeni˘i Mikha˘ilovich Lifshitz. Mechanics. CUP Archive, 1960. 2, 4, 6 10
1960
-
[48]
Long Le, Jason Xie, William Liang, Hung-Ju Wang, Yue Yang, Yecheng Jason Ma, Kyle Vedder, Arjun Kr- ishna, Dinesh Jayaraman, and Eric Eaton. Articulate- anything: Automatic modeling of articulated objects via a vision-language foundation model.arXiv preprint arXiv:2410.13882, 2024. 2
-
[49]
Pixie: Fast and generalizable supervised learning of 3d physics from pixels
Long Le, Ryan Lucas, Chen Wang, Chuhao Chen, Dinesh Jayaraman, Eric Eaton, and Lingjie Liu. Pixie: Fast and generalizable supervised learning of 3d physics from pixels. arXiv preprint arXiv:2508.17437, 2025. 8
-
[50]
Model reduction of dy- namical systems on nonlinear manifolds using deep convo- lutional autoencoders.Journal of Computational Physics, 404:108973, 2020
Kookjin Lee and Kevin T Carlberg. Model reduction of dy- namical systems on nonlinear manifolds using deep convo- lutional autoencoders.Journal of Computational Physics, 404:108973, 2020. 3
2020
-
[51]
Nap: Neural 3d articulated object prior.Advances in Neural Information Processing Systems, 36:31878–31894, 2023
Jiahui Lei, Congyue Deng, William B Shen, Leonidas J Guibas, and Kostas Daniilidis. Nap: Neural 3d articulated object prior.Advances in Neural Information Processing Systems, 36:31878–31894, 2023. 2
2023
-
[52]
Behavior-1k: A benchmark for embodied ai with 1,000 ev- eryday activities and realistic simulation
Chengshu Li, Ruohan Zhang, Josiah Wong, Cem Gokmen, Sanjana Srivastava, Roberto Mart ´ın-Mart´ın, Chen Wang, Gabrael Levine, Michael Lingelbach, Jiankai Sun, et al. Behavior-1k: A benchmark for embodied ai with 1,000 ev- eryday activities and realistic simulation. InConference on Robot Learning, pages 80–93. PMLR, 2023. 2
2023
-
[53]
Deformnet: Latent space modeling and dynamics prediction for deformable object manipula- tion
Chenchang Li, Zihao Ai, Tong Wu, Xiaosa Li, Wenbo Ding, and Huazhe Xu. Deformnet: Latent space modeling and dynamics prediction for deformable object manipula- tion. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 14770–14776. IEEE, 2024. 3
2024
-
[54]
Con- trolling diverse robots by inferring jacobian fields with deep networks.Nature, pages 1–7, 2025
Sizhe Lester Li, Annan Zhang, Boyuan Chen, Hanna Ma- tusik, Chao Liu, Daniela Rus, and Vincent Sitzmann. Con- trolling diverse robots by inferring jacobian fields with deep networks.Nature, pages 1–7, 2025. 3
2025
-
[55]
Plasticitynet: Learning to simulate metal, sand, and snow for optimization time in- tegration.Advances in Neural Information Processing Sys- tems, 35:27783–27796, 2022
Xuan Li, Yadi Cao, Minchen Li, Yin Yang, Craig Schroeder, and Chenfanfu Jiang. Plasticitynet: Learning to simulate metal, sand, and snow for optimization time in- tegration.Advances in Neural Information Processing Sys- tems, 35:27783–27796, 2022. 2
2022
-
[56]
Xuan Li, Yi-Ling Qiao, Peter Yichen Chen, Kr- ishna Murthy Jatavallabhula, Ming Lin, Chenfanfu Jiang, and Chuang Gan. Pac-nerf: Physics augmented continuum neural radiance fields for geometry-agnostic system identi- fication.arXiv preprint arXiv:2303.05512, 2023. 2
-
[57]
Dress-1-to- 3: Single image to simulation-ready 3d outfit with diffu- sion prior and differentiable physics.ACM Transactions on Graphics (TOG), 44(4):1–16, 2025
Xuan Li, Chang Yu, Wenxin Du, Ying Jiang, Tianyi Xie, Yunuo Chen, Yin Yang, and Chenfanfu Jiang. Dress-1-to- 3: Single image to simulation-ready 3d outfit with diffu- sion prior and differentiable physics.ACM Transactions on Graphics (TOG), 44(4):1–16, 2025. 2
2025
-
[58]
Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
Yunzhu Li, Jiajun Wu, Russ Tedrake, Joshua B Tenen- baum, and Antonio Torralba. Learning particle dynamics for manipulating rigid bodies, deformable objects, and flu- ids.arXiv preprint arXiv:1810.01566, 2018. 3
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[59]
Diffcloth: Differentiable cloth simulation with dry fric- tional contact.ACM Transactions on Graphics (TOG), 42 (1):1–20, 2022
Yifei Li, Tao Du, Kui Wu, Jie Xu, and Wojciech Matusik. Diffcloth: Differentiable cloth simulation with dry fric- tional contact.ACM Transactions on Graphics (TOG), 42 (1):1–20, 2022. 2
2022
-
[60]
3d neural scene representations for visuomotor control
Yunzhu Li, Shuang Li, Vincent Sitzmann, Pulkit Agrawal, and Antonio Torralba. 3d neural scene representations for visuomotor control. InConference on Robot Learning, pages 112–123. PMLR, 2022. 3
2022
-
[61]
Learning preconditioners for conjugate gradient pde solvers
Yichen Li, Peter Yichen Chen, Tao Du, and Wojciech Ma- tusik. Learning preconditioners for conjugate gradient pde solvers. InInternational Conference on Machine Learning, pages 19425–19439. PMLR, 2023. 3
2023
-
[62]
Diffavatar: Simulation-ready garment optimization with differentiable simulation
Yifei Li, Hsiao-yu Chen, Egor Larionov, Nikolaos Sarafi- anos, Wojciech Matusik, and Tuur Stuyck. Diffavatar: Simulation-ready garment optimization with differentiable simulation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4368– 4378, 2024. 2
2024
-
[63]
Self- supervised learning of latent space dynamics.Proceedings of the ACM on Computer Graphics and Interactive Tech- niques, 8(4):1–18, 2025
Yue Li, Gene Wei-Chin Lin, Egor Larionov, Aljaz Bozic, Doug Roble, Ladislav Kavan, Stelian Coros, Bernhard Thomaszewski, Tuur Stuyck, and Hsiao-Yu Chen. Self- supervised learning of latent space dynamics.Proceedings of the ACM on Computer Graphics and Interactive Tech- niques, 8(4):1–18, 2025. 3
2025
-
[64]
Fourier Neural Operator for Parametric Partial Differential Equations
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations.arXiv preprint arXiv:2010.08895, 2020. 3
work page internal anchor Pith review Pith/arXiv arXiv 2010
-
[65]
Zizhang Li, Hong-Xing Yu, Wei Liu, Yin Yang, Charles Herrmann, Gordon Wetzstein, and Jiajun Wu. Wonderplay: Dynamic 3d scene generation from a single image and ac- tions.arXiv preprint arXiv:2505.18151, 2025. 2
-
[66]
Differen- tiable cloth simulation for inverse problems.Advances in neural information processing systems, 32, 2019
Junbang Liang, Ming Lin, and Vladlen Koltun. Differen- tiable cloth simulation for inverse problems.Advances in neural information processing systems, 32, 2019. 2
2019
-
[67]
Phys4dgen: Physics-compliant 4d generation with multi-material composition perception
Jiajing Lin, Zhenzhong Wang, Dejun Xu, Shu Jiang, Yun- Peng Gong, and Min Jiang. Phys4dgen: Physics-compliant 4d generation with multi-material composition perception. arXiv preprint arXiv:2411.16800, 2024. 2
-
[68]
Jiajing Lin, Shu Jiang, Qingyuan Zeng, Zhenzhong Wang, and Min Jiang. Visionlaw: Inferring interpretable intrinsic dynamics from visual observations via bilevel optimization. arXiv preprint arXiv:2508.13792, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[69]
Yuchen Lin, Chenguo Lin, Jianjin Xu, and Yadong Mu. Omniphysgs: 3d constitutive gaussians for gen- eral physics-based dynamics generation.arXiv preprint arXiv:2501.18982, 2025. 2, 8
-
[70]
Paris: Part-level reconstruction and motion analysis for articulated objects
Jiayi Liu, Ali Mahdavi-Amiri, and Manolis Savva. Paris: Part-level reconstruction and motion analysis for articulated objects. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 352–363, 2023. 2
2023
-
[71]
Jiayi Liu, Denys Iliash, Angel X Chang, Manolis Savva, and Ali Mahdavi-Amiri. Singapo: Single image controlled generation of articulated parts in objects.arXiv preprint arXiv:2410.16499, 2024. 2, 6, 8
-
[72]
Differentiable robot rendering.arXiv preprint arXiv:2410.13851, 2024
Ruoshi Liu, Alper Canberk, Shuran Song, and Carl V on- drick. Differentiable robot rendering.arXiv preprint arXiv:2410.13851, 2024. 3
-
[73]
Physgen: Rigid-body physics-grounded image- 11 to-video generation
Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, and Shen- long Wang. Physgen: Rigid-body physics-grounded image- 11 to-video generation. InEuropean Conference on Computer Vision, pages 360–378. Springer, 2024. 2
2024
-
[74]
Smpl: a skinned multi-person linear model.ACM Transactions on Graph- ics (TOG), 34(6):1–16, 2015
Matthew Loper, Naureen Mahmood, Javier Romero, Ger- ard Pons-Moll, and Michael J Black. Smpl: a skinned multi-person linear model.ACM Transactions on Graph- ics (TOG), 34(6):1–16, 2015. 3
2015
-
[75]
Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning
Michael Lutter, Christian Ritter, and Jan Peters. Deep la- grangian networks: Using physics as model prior for deep learning.arXiv preprint arXiv:1907.04490, 2019. 3
work page internal anchor Pith review Pith/arXiv arXiv 1907
-
[76]
Learning neural constitutive laws from motion observations for generalizable pde dynamics
Pingchuan Ma, Peter Yichen Chen, Bolei Deng, Joshua B Tenenbaum, Tao Du, Chuang Gan, and Wojciech Matusik. Learning neural constitutive laws from motion observations for generalizable pde dynamics. InInternational Confer- ence on Machine Learning, pages 23279–23300. PMLR,
-
[77]
Xpbd: position-based simulation of compliant constrained dynamics
Miles Macklin, Matthias M ¨uller, and Nuttapong Chentanez. Xpbd: position-based simulation of compliant constrained dynamics. InProceedings of the 9th International Confer- ence on Motion in Games, pages 49–54, 2016. 2
2016
-
[78]
Explorable mesh deformation subspaces from unstructured 3d generative models
Arman Maesumi, Paul Guerrero, Noam Aigerman, Vladimir Kim, Matthew Fisher, Siddhartha Chaudhuri, and Daniel Ritchie. Explorable mesh deformation subspaces from unstructured 3d generative models. InSIGGRAPH Asia 2023 Conference Papers, pages 1–11, 2023. 3
2023
-
[79]
Real2code: Reconstruct articulated objects via code generation.arXiv preprint arXiv:2406.08474, 2024
Zhao Mandi, Yijia Weng, Dominik Bauer, and Shuran Song. Real2code: Reconstruct articulated objects via code generation.arXiv preprint arXiv:2406.08474, 2024. 2
-
[80]
Zhao Mandi, Yifan Hou, Dieter Fox, Yashraj Narang, Ajay Mandlekar, and Shuran Song. Dexmachina: Functional retargeting for bimanual dexterous manipulation.arXiv preprint arXiv:2505.24853, 2025. 2
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.