ArtMesh: Part-Aware Articulated Mesh Fields with Motion-Consistent Dynamics

Dan Wang; Ravi Ramamoorthi; Sylvia Yuan; Xinrui Cui

arxiv: 2605.16582 · v1 · pith:3YFNK5SNnew · submitted 2026-05-15 · 💻 cs.CV


Pith reviewed 2026-05-20 18:40 UTC · model grok-4.3


keywords articulated object reconstruction · mesh-based modeling · motion consistency · part-aware remeshing · differentiable rendering · 3D reconstruction · articulated dynamics · benchmark dataset


The pith

ArtMesh reconstructs articulated objects from images as connected triangle meshes with per-part rigid motions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

ArtMesh is a reconstruction technique that turns multi-view images of an object in start and end poses into an explicit triangle mesh where each semantic part moves as a rigid body. It replaces the unstructured point-based geometry of earlier pipelines with a structured surface that respects part boundaries through a part-aware restricted Delaunay remeshing step. Motion consistency is then enforced both on the mesh vertices as they move and on the pixels they render, so the final model obeys the object's connectivity. A reader would care because many real objects have moving parts whose boundaries must stay intact for the reconstruction to be usable in simulation or robotics.

Core claim

The paper's claim has three steps. First, a mesh-native differentiable renderer combined with part-aware restricted Delaunay remeshing produces connected submeshes free of triangles that cross part boundaries. Second, articulation parameters can then be optimized with bidirectional vertex-wise motion consistency on transported vertices together with pixel-wise motion consistency on rendered RGB-D images. Third, on the Articulate-100 benchmark this yields higher accuracy in joint parameter estimation and part-level geometry than prior point-based methods, especially for objects with many movable parts.

What carries the argument

Part-aware restricted Delaunay remeshing, which creates connected submeshes whose triangles stay inside semantic part boundaries so that motion consistency can operate directly on the object's topology.
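As a rough illustration of the boundary constraint, the sketch below hardens soft per-vertex part weights with an argmax and drops every triangle whose vertices carry different labels, which is the pre-processing the Figure 4 caption describes before restricted Delaunay is run per cluster. It does not perform the restricted Delaunay step itself. The names `harden` and `split_faces_by_part`, the list-based data layout, and the toy data are assumptions made for illustration, not the authors' code.

```python
def harden(weights):
    """Argmax over soft per-vertex part weights: one hard label per vertex."""
    return [max(range(len(w)), key=w.__getitem__) for w in weights]


def split_faces_by_part(faces, labels):
    """Group faces by part; faces whose vertices disagree on the label are dropped."""
    per_part = {}
    dropped = []
    for face in faces:
        part_ids = {labels[v] for v in face}
        if len(part_ids) == 1:
            per_part.setdefault(part_ids.pop(), []).append(face)
        else:
            dropped.append(face)  # cross-part triangle
    return per_part, dropped


# Toy data (hypothetical): 5 vertices, 2 parts, 3 triangles.
weights = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]
faces = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]

labels = harden(weights)
per_part, dropped = split_faces_by_part(faces, labels)
print(labels, per_part, dropped)
```

In this toy case only the first triangle survives, because the other two mix vertices from both parts. In a full pipeline the surviving per-part vertex clusters would then be re-triangulated so that each part stays connected.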

If this is right

Joint parameters are recovered more accurately than in unstructured point pipelines, especially when an object has many independent moving parts.
Part-level geometry remains coherent because triangles never straddle semantic boundaries.
Motion consistency can be applied directly along the mesh connectivity without extra topology repair steps.
The same mesh representation supports both start-to-end state alignment and direct rendering of intermediate poses.
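To make the last point concrete, here is a minimal sketch of posing a part-labeled mesh at an intermediate state. It assumes each movable part is driven by a revolute joint (pivot, unit axis, total angle) and that intermediate poses come from scaling the angle linearly. The abstract only says per-part rigid motion, so the joint model, the dictionary layout, and the function names are hypothetical.

```python
import math


def rotate_about_axis(p, origin, axis, angle):
    """Rodrigues rotation of point p about the line through origin along a unit axis."""
    v = [p[i] - origin[i] for i in range(3)]
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(axis[i] * v[i] for i in range(3))
    cross = [
        axis[1] * v[2] - axis[2] * v[1],
        axis[2] * v[0] - axis[0] * v[2],
        axis[0] * v[1] - axis[1] * v[0],
    ]
    return [origin[i] + v[i] * c + cross[i] * s + axis[i] * dot * (1 - c)
            for i in range(3)]


def pose_at(vertices, labels, joints, s):
    """Pose every vertex at fraction s in [0, 1] of its part's joint angle.

    Parts without an entry in `joints` are treated as static.
    """
    posed = []
    for p, k in zip(vertices, labels):
        joint = joints.get(k)
        if joint is None:
            posed.append(list(p))
        else:
            posed.append(rotate_about_axis(
                p, joint["origin"], joint["axis"], s * joint["angle"]))
    return posed


# Toy data (hypothetical): part 0 is static, part 1 swings 90 degrees about z.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
labels = [0, 1]
joints = {1: {"origin": (0.0, 0.0, 0.0), "axis": (0.0, 0.0, 1.0),
              "angle": math.pi / 2}}

half_open = pose_at(vertices, labels, joints, 0.5)
fully_open = pose_at(vertices, labels, joints, 1.0)
print(half_open[1], fully_open[1])
```

Because the faces index into the vertex list and never straddle parts, the same face list can be reused unchanged at every intermediate pose.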

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The explicit mesh output could be fed directly into physics engines to predict how the object will behave under new forces or grasps.
Extending the start-and-end supervision to full video sequences might let the same consistency losses track continuous articulation.
Because the surface is already segmented by construction, the method could supply ready-made part labels for downstream tasks such as affordance prediction.

Load-bearing premise

The remeshing step produces connected submeshes whose triangles do not cross semantic part boundaries.

What would settle it

Running ArtMesh on the Articulate-100 objects with many movable parts and finding no improvement, or a drop, in joint-parameter accuracy relative to 3D Gaussian Splatting baselines.

Figures

Figures reproduced from arXiv: 2605.16582 by Dan Wang, Ravi Ramamoorthi, Sylvia Yuan, Xinrui Cui.

**Figure 1.** ArtMesh reconstructs articulated objects as part-aware connected triangle meshes with per-part rigid motion. (a) Given multi-view observations at two articulation states, our method jointly recovers (i) a part-aware mesh field via per-part restricted Delaunay remeshing that prevents triangles from crossing part boundaries, and (ii) a motion-consistent articulation field trained with a forward–backward cyc…

**Figure 2.** Qualitative comparison of reconstructed surfaces. ArtGS…

**Figure 3.** Overview of our framework. Given multi-view RGB-D observations of an articulated object at two states t_1, t_2 ∈ {0, 1}, we reconstruct a pair of part-aware triangle meshes in correspondence and the per-part rigid articulation that relates them. The Part-Aware Mesh Field (Sec. 3.1) represents each state-t space Ω_t as a connected mesh whose vertex set V(t) ⊂ Ω_t is partitioned into per-part clusters {V_k(t)}…

**Figure 4.** Method components. (a) Part-Aware Restricted Delaunay: after hardening part weights, cross-part triangles (purple, dashed) are dropped and restricted Delaunay is run per cluster, yielding F⋆(t) = ⋃_k F⋆_k(t), which is manifold within each part and free of cross-part triangles. (b) Differentiable Render: front-to-back alpha compositing of N faces at pixel p, where the n-th face contributes c_n α_n(p) attenuated b…

**Figure 5.** Sample data from the Articulate-100 benchmark and…

**Figure 6.** Qualitative comparison of reconstructed surfaces. ArtGS…

**Figure 7.** Qualitative comparisons on representative multi-part ob…

**Figure 8.** Results on PARIS [23], included for comparability with prior work. PARIS is a benchmark (12 objects, all two-part) where ArtMesh’s advantages in scaling to high part counts are least exercised. The minor color difference between the ground-truth and predicted mesh renders arises because the results shown are Blender renders of the reconstructed meshes, not the rasterizer output.

**Figure 9.** Ablation of four components of ArtMesh on the…

**Figure 10.** ArtMesh reconstructions imported into NVIDIA Om…

**Figure 11.** Qualitative comparisons on representative multi-part objects from Articulate-100. Full figure containing state 0 reconstructed…

**Figure 12.** Failure case under heavy occlusion. When a movable…

**Figure 13.** Benchmark overview. For each sample object, we provide RGB, depth, segmentation, and articulation annotations, alongside…

read the original abstract

We present ArtMesh, a mesh-native method for reconstructing articulated objects explicitly as connected triangle meshes with per-part rigid motion from multi-view images in start and end states. Existing 3D Gaussian Splatting pipelines for articulated reconstruction inherit the unstructured point-based geometry of their splatting base, which provides no surface topology for reasoning about part boundaries or enforcing motion consistency along the object's connectivity. ArtMesh instead builds on a mesh-based differentiable rendering backbone, enabling part-aware dynamics to act directly on the structured topology. To make the topology compatible with articulation, we introduce part-aware restricted Delaunay remeshing, producing connected submeshes whose triangles do not cross semantic part boundaries. The dynamic mesh field then optimizes articulation using bidirectional Vertex-wise Motion Consistency on transported mesh vertices and Pixel-wise Motion Consistency on rendered RGB-D observations. We introduce Articulate-100, a new benchmark of 100 articulated objects spanning 16 PartNet-Mobility categories. On this benchmark, ArtMesh outperforms prior 3DGS-based pipelines in joint parameter estimation and part-level geometric reconstruction, with the largest gains on objects with many movable parts.
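The bidirectional vertex-wise term can be sketched as follows, under strong simplifying assumptions: the two state meshes are taken to be in one-to-one vertex correspondence, each part k has a rigid motion (R_k, t_k) from the start state to the end state, and the residual is a plain mean squared distance. The abstract does not give the exact loss, so this only illustrates the forward and backward transport idea, and all names are hypothetical.

```python
def apply_rigid(R, t, p):
    """Forward transport: R @ p + t, with R a 3x3 nested list."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def apply_rigid_inverse(R, t, p):
    """Backward transport: R^T @ (p - t), the inverse of a rigid motion."""
    d = [p[j] - t[j] for j in range(3)]
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bidirectional_vertex_loss(v0, v1, labels, motions):
    """Return (forward, backward) mean squared residuals.

    forward:  start-state vertices transported to the end state vs. v1
    backward: end-state vertices transported back to the start state vs. v0
    """
    fwd = bwd = 0.0
    for p0, p1, k in zip(v0, v1, labels):
        R, t = motions[k]
        fwd += sq_dist(apply_rigid(R, t, p0), p1)
        bwd += sq_dist(apply_rigid_inverse(R, t, p1), p0)
    n = len(v0)
    return fwd / n, bwd / n


# Toy data (hypothetical): part 0 is static, part 1 slides one unit along z.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
v1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
labels = [0, 1]

good = bidirectional_vertex_loss(
    v0, v1, labels, {0: (I3, [0.0, 0.0, 0.0]), 1: (I3, [0.0, 0.0, 1.0])})
bad = bidirectional_vertex_loss(
    v0, v1, labels, {0: (I3, [0.0, 0.0, 0.0]), 1: (I3, [0.0, 0.0, 0.0])})
print(good, bad)
```

With the correct translation for part 1 both residuals vanish, while leaving part 1 static gives a nonzero residual in both directions. That asymmetry between right and wrong motion parameters is the signal an optimizer would follow.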

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

ArtMesh moves articulated reconstruction to a mesh backbone with part-aware remeshing and bidirectional motion consistency, and it reports gains on a new benchmark, but the remeshing step's boundary fidelity is asserted without the checks needed to confirm it drives the results.


The paper's core move is replacing the unstructured point cloud of 3D Gaussian Splatting with an explicit triangle mesh that carries part topology. They add a restricted Delaunay remeshing step that tries to keep every triangle inside one semantic part, then optimize with vertex-wise motion consistency on the transported mesh and pixel-wise consistency on rendered RGB-D. On the new Articulate-100 benchmark they show better joint parameter recovery and part geometry than prior 3DGS pipelines, especially for objects with many movable parts. That is the actual novelty and the practical claim worth looking at.

Referee Report

1 major / 2 minor

Summary. The paper presents ArtMesh, a mesh-native method for reconstructing articulated objects explicitly as connected triangle meshes with per-part rigid motion from multi-view images in start and end states. It builds on a mesh-based differentiable rendering backbone and introduces part-aware restricted Delaunay remeshing to produce connected submeshes whose triangles do not cross semantic part boundaries. Articulation is optimized via bidirectional Vertex-wise Motion Consistency on transported mesh vertices and Pixel-wise Motion Consistency on rendered RGB-D observations. The method is evaluated on the new Articulate-100 benchmark of 100 objects across 16 PartNet-Mobility categories, where it outperforms prior 3DGS-based pipelines in joint parameter estimation and part-level geometric reconstruction, with the largest gains on objects with many movable parts.

Significance. If the core mechanisms hold, the work provides a structured topology-aware alternative to unstructured 3D Gaussian Splatting for articulated reconstruction, enabling direct enforcement of motion consistency along part connectivity. The introduction of the Articulate-100 benchmark is a clear positive contribution that can support future comparisons. The explicit focus on part boundaries and mesh connectivity addresses a recognized limitation of point-based pipelines.

major comments (1)

Abstract: The central claim that motion-consistent dynamics on the structured topology yield superior joint-parameter and part-geometry accuracy rests on the remeshing step producing submeshes whose triangles lie entirely inside semantic parts. The manuscript states that the remeshing “produces connected submeshes whose triangles do not cross semantic part boundaries,” yet provides no quantitative verification (e.g., percentage of crossing edges or Hausdorff distance to ground-truth part interfaces) on the Articulate-100 test set. This property is load-bearing for the reported gains on objects with many movable parts, as any boundary-crossing triangle would couple motion across parts that should remain independent.
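One of the checks suggested here, the percentage of crossing edges, is cheap to compute once per-vertex part labels are available. The sketch below is one possible formulation, not the authors' evaluation code: it counts unique mesh edges whose endpoints carry different labels. The function name `crossing_edge_fraction` and the toy mesh are hypothetical, and the labels are assumed to be reference part labels already transferred onto the reconstructed vertices.

```python
def crossing_edge_fraction(faces, labels):
    """Fraction of unique undirected edges whose endpoints lie in different parts."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    if not edges:
        return 0.0
    crossing = sum(1 for u, v in edges if labels[u] != labels[v])
    return crossing / len(edges)


# Toy data (hypothetical): vertex 3 belongs to a different part.
labels = [0, 0, 0, 1]
before = crossing_edge_fraction([(0, 1, 2), (1, 2, 3)], labels)
after = crossing_edge_fraction([(0, 1, 2)], labels)
print(before, after)
```

A value of zero over the whole test set would directly support the boundary claim. Any nonzero value would quantify how much cross-part coupling remains.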

minor comments (2)

Abstract: The description of input data (“multi-view images in start and end states”) could be expanded to clarify whether the method is restricted to two-frame pairs or supports longer sequences, as this affects the scope of the motion-consistency terms.
Abstract: Consider adding a brief statement on the typical number of input views and the source of semantic part labels used for remeshing, to help readers assess practical requirements.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment point by point below.


Referee: Abstract: The central claim that motion-consistent dynamics on the structured topology yield superior joint-parameter and part-geometry accuracy rests on the remeshing step producing submeshes whose triangles lie entirely inside semantic parts. The manuscript states that the remeshing “produces connected submeshes whose triangles do not cross semantic part boundaries,” yet provides no quantitative verification (e.g., percentage of crossing edges or Hausdorff distance to ground-truth part interfaces) on the Articulate-100 test set. This property is load-bearing for the reported gains on objects with many movable parts, as any boundary-crossing triangle would couple motion across parts that should remain independent.

Authors: We agree that quantitative verification of the remeshing step would strengthen the central claim. The part-aware restricted Delaunay remeshing incorporates semantic part labels to restrict triangles to within part boundaries, and the manuscript includes qualitative results showing clean separation. However, we did not provide explicit metrics such as crossing-edge percentages or Hausdorff distances in the initial submission. In the revised manuscript we will add these quantitative evaluations on the Articulate-100 test set to directly support the reported gains, especially for objects with many movable parts.

Revision: yes

Circularity Check

0 steps flagged

No circularity: method introduces independent remeshing and consistency mechanisms


The paper presents ArtMesh as a mesh-native reconstruction pipeline that adds part-aware restricted Delaunay remeshing to produce boundary-respecting submeshes and then applies bidirectional vertex-wise and pixel-wise motion consistency on the resulting topology. These components are defined and motivated directly from the need to handle articulation on structured meshes, rather than being fitted to or defined in terms of the final joint-parameter or reconstruction accuracy numbers. The new Articulate-100 benchmark is introduced separately, and performance comparisons are reported against external 3DGS baselines without any self-referential loop that would make the claimed gains tautological. No self-citations, uniqueness theorems, or ansatzes are invoked in the provided text to justify core choices, leaving the derivation chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only view yields no explicit free parameters, axioms, or invented entities beyond standard differentiable rendering assumptions; full paper would be needed to audit optimization hyperparameters or topology assumptions.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "part-aware restricted Delaunay remeshing, producing connected submeshes whose triangles do not cross semantic part boundaries"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 1 internal anchor

[1] Zoey Chen, Aaron Walsman, Marius Memmel, Kaichun Mo, Alex Fang, Karthikeya Vemuri, Alan Wu, Dieter Fox, and Abhishek Gupta. Urdformer: A pipeline for constructing articulated simulation environments from real-world images,
[2] Siu-Wing Cheng, Tamal Krishna Dey, Jonathan Shewchuk, and Sartaj Sahni. Delaunay mesh generation. CRC Press Boca Raton, 2013.
[3] Jianning Deng, Kartic Subr, and Hakan Bilen. Articulate your nerf: Unsupervised articulated object modeling via conditional view synthesis. arXiv preprint arXiv:2406.16623,
[4] Mingju Gao, Yike Pan, Huan-ang Gao, Zongzheng Zhang, Wenyi Li, Hao Dong, Hao Tang, Li Yi, and Hao Zhao. Partrm: Modeling part-level dynamics with large cross-state reconstruction model, 2025.
[5] Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, and He Wang. Gapartnet: Cross-category domain-generalizable object perception and manipulation via generalizable and actionable parts. arXiv preprint arXiv:2211.05272, 2022.
[6] Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, and Leonidas Guibas. Sage: Bridging semantic and actionable parts for generalizable articulated-object manipulation under language instructions, 2023.
[7] Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. CVPR, 2024.
[8] Antoine Guédon, Diego Gomez, Nissim Maruani, Bingchen Gong, George Drettakis, and Maks Ovsjanikov. Milo: Mesh-in-the-loop gaussian splatting for detailed and efficient surface reconstruction. ACM Transactions on Graphics, 2025.
[9] Junfu Guo, Yu Xin, Gaoyi Liu, Kai Xu, Ligang Liu, and Ruizhen Hu. Articulatedgs: Self-supervised digital twin modeling of articulated objects using 3d gaussian splatting. arXiv preprint arXiv:2503.08135, 2025.
[10] Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. Lin, Marc Van Droogenbroeck, and Andrea Tagliasacchi. Meshsplatting: Differentiable rendering with opaque meshes. arXiv, 2025.
[11] Jan Held, Renaud Vandeghen, Adrien Deliege, Abdullah Hamdi, Anthony Cioppa, Silvio Giancola, Andrea Vedaldi, Bernard Ghanem, Andrea Tagliasacchi, and Marc Van Droogenbroeck. Triangle splatting for real-time radiance field rendering. arXiv, 2025.
[12] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accurate radiance fields. In SIGGRAPH 2024 Conference Papers. Association for Computing Machinery, 2024.
[13] Siyuan Huang, Haonan Chang, Yuhan Liu, Yimeng Zhu, Hao Dong, Peng Gao, Abdeslam Boularias, and Hongsheng Li. A3vlm: Actionable articulation-aware vision language model. arXiv preprint arXiv:2406.07549, 2024.
[14] Hanxiao Jiang, Yongsen Mao, Manolis Savva, and Angel X Chang. Opd: Single-view 3d openable part detection. In Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIX, pages 410–426. Springer, 2022.
[15] Zhenyu Jiang, Cheng-Chun Hsu, and Yuke Zhu. Ditto: Building digital twins of articulated objects from interaction. In arXiv preprint arXiv:2202.08227, 2022.
[16] Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3d mesh renderer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[17] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering, 2023.
[18] Seungyeon Kim, Junsu Ha, Young Hun Kim, Yonghyeon Lee, and Frank C Park. Screwsplat: An end-to-end method for articulated object recognition. arXiv preprint arXiv:2508.02146, 2025.
[19] Long Le, Jason Xie, William Liang, Hung-Ju Wang, Yue Yang, Yecheng Jason Ma, Kyle Vedder, Arjun Krishna, Dinesh Jayaraman, and Eric Eaton. Articulate-anything: Automatic modeling of articulated objects via a vision-language foundation model. arXiv preprint arXiv:2410.13882, 2024.
[20] Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, and Hao Dong. Manipllm: Embodied multimodal large language model for object-centric robotic manipulation, 2023.
[21] Zhe Li, Xiang Bai, Jieyu Zhang, Zhuangzhe Wu, Che Xu, Ying Li, Chengkai Hou, and Shanghang Zhang. Urdf-anything: Constructing articulated objects with 3d multimodal language model, 2025.
[22] Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad, Vitor Campagnolo Guizilini, Rares Andrei Ambrus, Greg Shakhnarovich, and Matthew R. Walter. Splart: Articulation estimation and part-level reconstruction with 3d gaussian splatting, 2025.
[23] Jiayi Liu, Ali Mahdavi-Amiri, and Manolis Savva. Paris: Part-level reconstruction and motion analysis for articulated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 352–363, 2023.
[24] Jiayi Liu, Denys Iliash, Angel X Chang, Manolis Savva, and Ali Mahdavi-Amiri. SINGAPO: Single image controlled generation of articulated parts in object. arXiv preprint arXiv:2410.16499, 2024.
[25] Shichen Liu, Tianye Li, Weikai Chen, and Hao Li. Soft rasterizer: A differentiable renderer for image-based 3d reasoning, 2019.
[26] Yu Liu, Baoxiong Jia, Ruijie Lu, Junfeng Ni, Song-Chun Zhu, and Siyuan Huang. Building interactable replicas of complex articulated objects via gaussian splatting. In The Thirteenth International Conference on Learning Representations, 2025.
[27] Ruijie Lu, Yu Liu, Jiaxiang Tang, Junfeng Ni, Yuxiang Wang, Diwen Wan, Gang Zeng, Yixin Chen, and Siyuan Huang. Dreamart: Generating interactable articulated objects from a single image, 2025.
[28] Zhao Mandi, Yijia Weng, Dominik Bauer, and Shuran Song. Real2code: Reconstruct articulated objects via code generation, 2024.
[29] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[30] Kaichun Mo, Leonidas Guibas, Mustafa Mukadam, Abhinav Gupta, and Shubham Tulsiani. Where2act: From pixels to actions for articulated 3d objects. In International Conference on Computer Vision (ICCV), 2021.
[31] Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan L. Yuille, Nuno Vasconcelos, and Xiaolong Wang. A-SDF: learning disentangled signed distance functions for articulated shape representation. pages 12981–12991, 2021.
[32] Atsuhiro Noguchi, Xiao Sun, Stephen Lin, and Tatsuya Harada. Neural articulated radiance field. In International Conference on Computer Vision, 2021.
[33] Licheng Shen, Saining Zhang, Honghan Li, Peilin Yang, Zihao Huang, Zongzheng Zhang, and Hao Zhao. Gaussianart: Unified modeling of geometry and motion for articulated objects, 2025.
[34] Xiaohao Sun, Hanxiao Jiang, Manolis Savva, and Angel Xuan Chang. Opdmulti: Openable part detection for multiple objects. arXiv preprint arXiv:2303.14087, 2023.
[35] Archana Swaminathan, Anubhav Gupta, Kamal Gupta, Shishira R. Maiya, Vatsal Agarwal, and Abhinav Shrivastava. Leia: Latent view-invariant embeddings for implicit 3d articulation, 2024.
[36] Wei-Cheng Tseng, Hung-Ju Liao, Yen-Chen Lin, and Min Sun. Cla-nerf: Category-level articulated neural radiance field. In ICRA, 2022.
[37] Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021.
[38] Xiaogang Wang, Bin Zhou, Yahao Shi, Xiaowu Chen, Qinping Zhao, and Kai Xu. Shape2motion: Joint analysis of motion parts and attributes from 3d shapes. IEEE Conference on Computer Vision and Pattern, XX(XX):to appear,
[39] Yijia Weng, He Wang, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan, Baoquan Chen, Hao Su, and Leonidas J. Guibas. Captra: Category-level pose tracking for rigid and articulated objects from point clouds. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 13209–13218, 2021.
[40] Yijia Weng, Bowen Wen, Jonathan Tremblay, Valts Blukis, Dieter Fox, Leonidas Guibas, and Stan Birchfield. Neural implicit representation for building digital twins of unknown articulated objects. In CVPR, 2024.
[41] Di Wu, Liu Liu, Zhou Linli, Anran Huang, Liangtu Song, Qiaojun Yu, Qi Wu, and Cewu Lu. Reartgs: Reconstructing and generating articulated objects via 3d gaussian splatting with geometric and motion constraints. arXiv preprint arXiv:2503.06677, 2025.
[42] Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, and Hao Su. SAPIEN: A simulated part-based interactive environment. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[43] Tianjiao Yu, Vedant Shah, Muntasir Wahed, Ying Shen, Kiet A. Nguyen, and Ismini Lourentzou. Part2gs: Part-aware modeling of articulated objects using 3d gaussian splatting,
[44] Zehao Yu, Torsten Sattler, and Andreas Geiger. Gaussian opacity fields: Efficient adaptive surface reconstruction in unbounded scenes. ACM Transactions on Graphics, 2024.
[45] Sylvia Yuan, Ruoxi Shi, Xinyue Wei, Xiaoshuai Zhang, Hao Su, and Minghua Liu. Larm: A large articulated-object reconstruction model. arXiv preprint arXiv:2511.11563, 2025.
[46] Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiaoxiao Long, and Ping Tan. Rade-gs: Rasterizing depth in gaussian splatting, 2024.

A. Training Details

Scheduling details. We train each object for a total of T = 4×10^4 iterations, evenly split into a reconstruction phase (s1 = 2×10^4) and an articulation phase (s2 = 2×10^4). The reconstruction phase...

[1] [1]

Urdformer: A pipeline for constructing articulated simulation environments from real-world images,

Zoey Chen, Aaron Walsman, Marius Memmel, Kaichun Mo, Alex Fang, Karthikeya Vemuri, Alan Wu, Dieter Fox, and Abhishek Gupta. Urdformer: A pipeline for constructing articulated simulation environments from real-world images,

work page

[2] [2]

CRC Press Boca Raton, 2013

Siu-Wing Cheng, Tamal Krishna Dey, Jonathan Shewchuk, and Sartaj Sahni.Delaunay mesh generation. CRC Press Boca Raton, 2013. 5

work page 2013

[3] [3]

Articulate your nerf: Unsupervised articulated object modeling via con- ditional view synthesis.arXiv preprint arXiv:2406.16623,

Jianning Deng, Kartic Subr, and Hakan Bilen. Articulate your nerf: Unsupervised articulated object modeling via con- ditional view synthesis.arXiv preprint arXiv:2406.16623,

work page arXiv

[4] [4]

Partrm: Modeling part-level dynamics with large cross-state reconstruction model, 2025

Mingju Gao, Yike Pan, Huan ang Gao, Zongzheng Zhang, Wenyi Li, Hao Dong, Hao Tang, Li Yi, and Hao Zhao. Partrm: Modeling part-level dynamics with large cross-state reconstruction model, 2025. 3

work page 2025

[5] Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, and He Wang. Gapartnet: Cross-category domain-generalizable object perception and manipulation via generalizable and actionable parts. arXiv preprint arXiv:2211.05272, 2022.

[6] Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, and Leonidas Guibas. Sage: Bridging semantic and actionable parts for generalizable articulated-object manipulation under language instructions, 2023.

[7] Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. CVPR, 2024.

[8] Antoine Guédon, Diego Gomez, Nissim Maruani, Bingchen Gong, George Drettakis, and Maks Ovsjanikov. Milo: Mesh-in-the-loop gaussian splatting for detailed and efficient surface reconstruction. ACM Transactions on Graphics, 2025.

[9] Junfu Guo, Yu Xin, Gaoyi Liu, Kai Xu, Ligang Liu, and Ruizhen Hu. Articulatedgs: Self-supervised digital twin modeling of articulated objects using 3d gaussian splatting. arXiv preprint arXiv:2503.08135, 2025.

[10] Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. G Lin, Marc Van Droogenbroeck, and Andrea Tagliasacchi. Meshsplatting: Differentiable rendering with opaque meshes. arXiv, 2025.

[11] Jan Held, Renaud Vandeghen, Adrien Deliege, Abdullah Hamdi, Anthony Cioppa, Silvio Giancola, Andrea Vedaldi, Bernard Ghanem, Andrea Tagliasacchi, and Marc Van Droogenbroeck. Triangle splatting for real-time radiance field rendering. arXiv, 2025.

[12] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accurate radiance fields. In SIGGRAPH 2024 Conference Papers. Association for Computing Machinery, 2024.

[13] Siyuan Huang, Haonan Chang, Yuhan Liu, Yimeng Zhu, Hao Dong, Peng Gao, Abdeslam Boularias, and Hongsheng Li. A3vlm: Actionable articulation-aware vision language model. arXiv preprint arXiv:2406.07549, 2024.

[14] Hanxiao Jiang, Yongsen Mao, Manolis Savva, and Angel X Chang. Opd: Single-view 3d openable part detection. In Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIX, pages 410–426. Springer, 2022.

[15] Zhenyu Jiang, Cheng-Chun Hsu, and Yuke Zhu. Ditto: Building digital twins of articulated objects from interaction. arXiv preprint arXiv:2202.08227, 2022.

[16] Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3d mesh renderer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[17] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering, 2023.

[18] Seungyeon Kim, Junsu Ha, Young Hun Kim, Yonghyeon Lee, and Frank C Park. Screwsplat: An end-to-end method for articulated object recognition. arXiv preprint arXiv:2508.02146, 2025.

[19] Long Le, Jason Xie, William Liang, Hung-Ju Wang, Yue Yang, Yecheng Jason Ma, Kyle Vedder, Arjun Krishna, Dinesh Jayaraman, and Eric Eaton. Articulate-anything: Automatic modeling of articulated objects via a vision-language foundation model. arXiv preprint arXiv:2410.13882, 2024.

[20] Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, and Hao Dong. Manipllm: Embodied multimodal large language model for object-centric robotic manipulation, 2023.

[21] Zhe Li, Xiang Bai, Jieyu Zhang, Zhuangzhe Wu, Che Xu, Ying Li, Chengkai Hou, and Shanghang Zhang. Urdf-anything: Constructing articulated objects with 3d multimodal language model, 2025.

[22] Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad, Vitor Campagnolo Guizilini, Rares Andrei Ambrus, Greg Shakhnarovich, and Matthew R. Walter. Splart: Articulation estimation and part-level reconstruction with 3d gaussian splatting, 2025.

[23] Jiayi Liu, Ali Mahdavi-Amiri, and Manolis Savva. Paris: Part-level reconstruction and motion analysis for articulated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 352–363, 2023.

[24] Jiayi Liu, Denys Iliash, Angel X Chang, Manolis Savva, and Ali Mahdavi-Amiri. SINGAPO: Single image controlled generation of articulated parts in object. arXiv preprint arXiv:2410.16499, 2024.

[25] Shichen Liu, Tianye Li, Weikai Chen, and Hao Li. Soft rasterizer: A differentiable renderer for image-based 3d reasoning, 2019.

[26] Yu Liu, Baoxiong Jia, Ruijie Lu, Junfeng Ni, Song-Chun Zhu, and Siyuan Huang. Building interactable replicas of complex articulated objects via gaussian splatting. In The Thirteenth International Conference on Learning Representations, 2025.

[27] Ruijie Lu, Yu Liu, Jiaxiang Tang, Junfeng Ni, Yuxiang Wang, Diwen Wan, Gang Zeng, Yixin Chen, and Siyuan Huang. Dreamart: Generating interactable articulated objects from a single image, 2025.

[28] Zhao Mandi, Yijia Weng, Dominik Bauer, and Shuran Song. Real2code: Reconstruct articulated objects via code generation, 2024.

[29] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.

[30] Kaichun Mo, Leonidas Guibas, Mustafa Mukadam, Abhinav Gupta, and Shubham Tulsiani. Where2act: From pixels to actions for articulated 3d objects. In International Conference on Computer Vision (ICCV), 2021.

[31] Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan L. Yuille, Nuno Vasconcelos, and Xiaolong Wang. A-SDF: learning disentangled signed distance functions for articulated shape representation. pages 12981–12991, 2021.

[32] Atsuhiro Noguchi, Xiao Sun, Stephen Lin, and Tatsuya Harada. Neural articulated radiance field. In International Conference on Computer Vision, 2021.

[33] Licheng Shen, Saining Zhang, Honghan Li, Peilin Yang, Zihao Huang, Zongzheng Zhang, and Hao Zhao. Gaussianart: Unified modeling of geometry and motion for articulated objects, 2025.

[34] Xiaohao Sun, Hanxiao Jiang, Manolis Savva, and Angel Xuan Chang. Opdmulti: Openable part detection for multiple objects. arXiv preprint arXiv:2303.14087, 2023.

[35] Archana Swaminathan, Anubhav Gupta, Kamal Gupta, Shishira R. Maiya, Vatsal Agarwal, and Abhinav Shrivastava. Leia: Latent view-invariant embeddings for implicit 3d articulation, 2024.

[36] Wei-Cheng Tseng, Hung-Ju Liao, Yen-Chen Lin, and Min Sun. Cla-nerf: Category-level articulated neural radiance field. In ICRA, 2022.

[37] Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021.

[38] Xiaogang Wang, Bin Zhou, Yahao Shi, Xiaowu Chen, Qinping Zhao, and Kai Xu. Shape2motion: Joint analysis of motion parts and attributes from 3d shapes. IEEE Conference on Computer Vision and Pattern, XX(XX):to appear,

[39] Yijia Weng, He Wang, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan, Baoquan Chen, Hao Su, and Leonidas J. Guibas. Captra: Category-level pose tracking for rigid and articulated objects from point clouds. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 13209–13218, 2021.

[40] Yijia Weng, Bowen Wen, Jonathan Tremblay, Valts Blukis, Dieter Fox, Leonidas Guibas, and Stan Birchfield. Neural implicit representation for building digital twins of unknown articulated objects. In CVPR, 2024.

[41] Di Wu, Liu Liu, Zhou Linli, Anran Huang, Liangtu Song, Qiaojun Yu, Qi Wu, and Cewu Lu. Reartgs: Reconstructing and generating articulated objects via 3d gaussian splatting with geometric and motion constraints. arXiv preprint arXiv:2503.06677, 2025.

[42] Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, and Hao Su. SAPIEN: A simulated part-based interactive environment. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

[43] Tianjiao Yu, Vedant Shah, Muntasir Wahed, Ying Shen, Kiet A. Nguyen, and Ismini Lourentzou. Part2gs: Part-aware modeling of articulated objects using 3d gaussian splatting,

[44] Zehao Yu, Torsten Sattler, and Andreas Geiger. Gaussian opacity fields: Efficient adaptive surface reconstruction in unbounded scenes. ACM Transactions on Graphics, 2024.

[45] Sylvia Yuan, Ruoxi Shi, Xinyue Wei, Xiaoshuai Zhang, Hao Su, and Minghua Liu. Larm: A large articulated-object reconstruction model. arXiv preprint arXiv:2511.11563, 2025.

[46] Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiaoxiao Long, and Ping Tan. Rade-gs: Rasterizing depth in gaussian splatting, 2024.