Make-It-Poseable: Feed-forward Latent Posing Model for 3D Characters

Alan Zhao; Houqiang Li; Jax Xiang; Ori Zhang; Wengang Zhou; Zhenxun Yuan; Zhiyang Guo

arxiv: 2512.16767 · v2 · pith:GNVTFQQ7new · submitted 2025-12-18 · 💻 cs.CV

Zhiyang Guo, Ori Zhang, Jax Xiang, Alan Zhao, Zhenxun Yuan, Wengang Zhou, Houqiang Li

Pith reviewed 2026-05-16 21:27 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D character posing · latent space transformation · feed-forward model · skinning-free · zero-shot generalization · mesh deformation · computer graphics

The pith

Make-It-Poseable poses 3D characters by transforming compact latent representations instead of meshes or skinning weights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Make-It-Poseable, a feed-forward framework that reformulates 3D character posing as a skinning-free transformation in latent space. It combines a latent posing transformer, a dense pose representation, and an adaptive completion module trained with a bipartite-matched latent loss to handle topological changes. This setup targets problems in AI-generated assets that have irregular structures and fused geometry. A sympathetic reader would care because the approach claims to deliver higher pose conformance and zero-shot generalization to shapes such as quadrupeds while supporting editing tasks like part replacement.

Core claim

The central claim is that character posing can be recast as direct manipulation of compact latent representations of 3D shapes. The method integrates a latent posing transformer for shape manipulation, a dense pose representation for fine-grained control, and an adaptive completion module optimized via a bipartite-matched latent loss. This skinning-free design bypasses fixed mesh connectivity and traditional rigging constraints, enabling robust reconstruction under arbitrary topological changes.
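To make the bipartite-matched latent loss concrete, here is a minimal Python sketch; it is not the paper's implementation. It assumes predicted and target latents are equally sized sets of small vectors, uses squared Euclidean distance as the per-pair cost, and finds the best one-to-one assignment by brute force. The names `sq_dist` and `bipartite_matched_loss` are hypothetical, and a practical version would likely use a Hungarian-style solver rather than enumerating permutations.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bipartite_matched_loss(pred_tokens, target_tokens):
    """Mean pairwise cost under the cheapest one-to-one token assignment.

    Brute force over all permutations: only suitable for tiny sets
    (illustrative assumption, not the paper's solver).
    Returns (loss, assignment) where assignment[i] is the index of the
    target token matched to predicted token i.
    """
    n = len(pred_tokens)
    if n != len(target_tokens):
        raise ValueError("this sketch assumes equally sized token sets")
    if n == 0:
        return 0.0, ()
    # cost[i][j]: cost of matching predicted token i to target token j
    cost = [[sq_dist(p, t) for t in target_tokens] for p in pred_tokens]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_cost / n, best_perm


# Same three tokens in a different order: the matched loss is zero.
pred = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
target = [(5.0, 5.0), (0.0, 0.0), (1.0, 1.0)]
loss, assignment = bipartite_matched_loss(pred, target)
print(loss, assignment)  # 0.0 (1, 2, 0)
```

Because the cost is computed after matching, the loss does not depend on the order in which tokens are emitted. That order-invariance is plausibly what makes a set-style loss a natural fit for tokens added to cover newly exposed structures.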

What carries the argument

Latent posing transformer that performs shape manipulation directly on compact latent representations, decoupled from mesh topology.

If this is right

The method significantly outperforms existing baselines in posing quality.
The skeleton-agnostic design exhibits zero-shot generalization to diverse morphologies including quadrupeds.
It seamlessly supports 3D authoring applications such as part replacement and refinement.
It robustly processes AI-generated assets that exhibit flawed structures and fused geometry.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Animation pipelines could reduce reliance on manual rigging steps for procedurally generated models.
Real-time posing tools in games or VR might incorporate this latent approach for faster iteration on varied character shapes.
Extending the latent space with temporal information could support synthesis of animated sequences from static posed inputs.

Load-bearing premise

Compact latent representations preserve enough geometric detail to reconstruct fine features and handle arbitrary topological changes without mesh-specific priors or artifacts.

What would settle it

Apply the model to AI-generated characters with fused geometry or fine details such as hair and measure whether posed outputs show visible artifacts or loss of detail relative to a high-resolution reference mesh.
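One way to run such a measurement is a symmetric Chamfer distance between points sampled from the posed output and from the reference mesh. The sketch below is an illustrative assumption, not the paper's evaluation protocol: this page does not state which metrics the paper reports, and the helper names are hypothetical. It uses the mean squared nearest-neighbor convention; other conventions (unsquared distances, sums instead of means) are also common.

```python
def nearest_sq_dist(p, points):
    """Squared distance from point p to its nearest neighbor in points."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in points)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Sum of the two directed terms, each the mean squared distance from
    every point in one set to its nearest neighbor in the other.
    O(len(a) * len(b)); fine for a sketch, too slow for dense samples.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(nearest_sq_dist(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_sq_dist(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


# Toy data: the posed output is missing one point of the reference,
# so only the reference-to-posed direction contributes an error.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
posed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(chamfer_distance(reference, posed))
```

A missing or smoothed-away feature such as a hair strand shows up in the reference-to-output term, so reporting the two directed terms separately can help distinguish lost detail from spurious geometry.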

Figures

Figures reproduced from arXiv: 2512.16767 by Alan Zhao, Houqiang Li, Jax Xiang, Ori Zhang, Wengang Zhou, Zhenxun Yuan, Zhiyang Guo.

**Figure 1.** Given a 3D humanoid model of arbitrary shape and initial pose, our method efficiently re-poses it in a single feed-forward pass.

**Figure 2.** Pipeline of our character posing framework. Given a source shape and source/target skeletons, we encode them into latent representations with dense correspondence. A latent posing transformer then predicts the target shape tokens, which are finally decoded into the posed mesh. This framework is trained in two stages. First, a latent loss is established to preserve geometric details. Second, an adaptive com…

**Figure 3.** Illustration of our key designs. (a) The skeleton encoder (Sec. 3.2) produces dense pose representations with latent-level one-to-one correspondence. (b) Latent-space supervision (Sec. 3.4) ensures a semantically meaningful token transformation path to preserve geometric details. (c) Adaptive tokens (Sec. 3.5) are introduced in the finetuning stage to handle newly exposed structures after deformation. wher…

**Figure 4.** Qualitative comparison on diverse characters and poses. We showcase results for re-posing each character into a widely adopted T-pose and an additional random pose. Our method produces high-fidelity results across various cases. It robustly handles challenging inputs where MIA [6] and Puppeteer [25] produce significant artifacts, and gives better pose conformance and detail preservation compared to HY3D-O…

**Figure 5.** Our method enables various applications,

Original abstract

Posing 3D characters is a fundamental task in computer graphics. However, existing paradigms, ranging from traditional auto-rigging to recent pose-conditioned generative models, frequently struggle with inaccurate skinning weights, fixed mesh topologies, and poor pose conformance. These challenges have become particularly pronounced with the recent explosion of AI-generated 3D assets, which often exhibit flawed structures and fused geometry. To address these issues, we introduce Make-It-Poseable, a novel feed-forward framework that reformulates character posing as a skinning-free latent-space transformation problem. By decoupling shape deformation from the constraints of fixed mesh connectivity, our method directly operates on compact latent representations to reconstruct characters in target poses. To achieve this, our framework integrates a latent posing transformer for shape manipulation, a dense pose representation for fine-grained control, and an adaptive completion module optimized via a bipartite-matched latent loss to robustly handle topological changes. Extensive experiments demonstrate that our method significantly outperforms existing baselines in posing quality. Furthermore, our skeleton-agnostic design exhibits remarkable zero-shot generalization to diverse morphologies including quadrupeds and seamlessly supports various 3D authoring applications such as part replacement and refinement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper reframes 3D character posing as a skinning-free latent-space transform, but its performance claims rest on experiments that are not quantified in the abstract.

The main takeaway is that Make-It-Poseable treats posing as a direct transformation inside a compact latent representation instead of relying on rigging or mesh connectivity. This targets the practical mess of AI-generated 3D assets that often have fused geometry and irregular structures. The new pieces are a latent posing transformer, a dense pose representation for control, and an adaptive completion module trained with a bipartite-matched latent loss. The skeleton-agnostic framing and claimed zero-shot behavior on quadrupeds plus part-replacement tasks are distinct from standard auto-rigging or pose-conditioned generators. If the full results hold, the approach could reduce manual cleanup in content pipelines.

The abstract states that the method outperforms baselines in quality and generalizes well, but it supplies no numbers, error bars, or ablation tables to back that up. The central risk is exactly the one in the stress test: compact latents commonly discard high-frequency surface detail, and without mesh-specific priors the completion step may not recover clean geometry under topological changes. That concern lands on the zero-shot and refinement claims.

The paper is aimed at graphics researchers and tool builders who work with generative 3D assets. A reader looking for fresh framing on posing could pull useful ideas from the architecture, but anyone needing reproducible evidence will have to wait for the full manuscript. I would send it to peer review so referees can check whether the experiments actually support the outperformance and generalization statements.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Make-It-Poseable, a feed-forward framework that reformulates 3D character posing as a skinning-free latent-space transformation problem. It integrates a latent posing transformer for shape manipulation, a dense pose representation for fine-grained control, and an adaptive completion module optimized via a bipartite-matched latent loss to handle topological changes. The central claims are that the method significantly outperforms existing baselines in posing quality, exhibits zero-shot generalization to diverse morphologies including quadrupeds, and supports 3D authoring applications such as part replacement and refinement, particularly for AI-generated assets with irregular structures.

Significance. If the empirical claims hold with proper validation, this work could meaningfully advance computer graphics by enabling robust posing of AI-generated 3D models without reliance on fixed topologies or accurate skinning weights. The skeleton-agnostic latent-space approach addresses a growing practical need and could influence downstream tasks in 3D content creation.

major comments (2)

[Abstract] The claim that the method 'significantly outperforms existing baselines in posing quality' and exhibits 'remarkable zero-shot generalization' is load-bearing for the contribution but is unsupported by any quantitative metrics, error bars, ablation details, or specific experimental results. This absence prevents assessment of the central empirical assertions.

[Method] Latent posing transformer and adaptive completion: The assumption that compact latent representations preserve sufficient high-frequency geometric details to reconstruct posed characters without artifacts under arbitrary topological changes (e.g., fused AI-generated geometry) is central to the zero-shot and outperformance claims, yet the manuscript provides no direct evidence or analysis addressing the risk that the encoder discards such information.

minor comments (1)

[Abstract] Consider adding one sentence naming the primary baselines used for comparison to contextualize the outperformance claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the potential impact of our work on posing AI-generated 3D assets. We address each major comment below and have revised the manuscript to improve clarity and provide additional supporting analysis.

Referee: [Abstract] The claim that the method 'significantly outperforms existing baselines in posing quality' and exhibits 'remarkable zero-shot generalization' is load-bearing for the contribution but is unsupported by any quantitative metrics, error bars, ablation details, or specific experimental results. This absence prevents assessment of the central empirical assertions.

Authors: We agree that the abstract would benefit from explicit quantitative anchors to allow immediate assessment of the claims. In the revised version we have updated the abstract to include key metrics (e.g., average Chamfer-distance reduction and zero-shot success rate on quadrupeds) drawn directly from the experimental tables, while preserving conciseness. Full results with error bars, statistical significance, and ablation details remain in Section 4.

Revision: yes

Referee: [Method] Latent posing transformer and adaptive completion: The assumption that compact latent representations preserve sufficient high-frequency geometric details to reconstruct posed characters without artifacts under arbitrary topological changes (e.g., fused AI-generated geometry) is central to the zero-shot and outperformance claims, yet the manuscript provides no direct evidence or analysis addressing the risk that the encoder discards such information.

Authors: The referee correctly identifies that the manuscript relies primarily on end-to-end empirical success rather than a direct information-preservation study. To address this, we have added a short analysis subsection and supplementary visualizations that compare high-frequency surface details before and after latent encoding/decoding on the most irregular AI-generated examples. We have also included a latent-dimension ablation that quantifies the point at which reconstruction artifacts appear. These additions provide the requested direct evidence without altering the core method.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation relies on independently trained modules and external baselines

The paper presents a feed-forward latent posing framework with a latent posing transformer, dense pose representation, and adaptive completion module trained via bipartite-matched latent loss. These components are described as novel architectural choices optimized end-to-end, with performance evaluated against external baselines rather than internal fitted quantities. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described chain. The zero-shot generalization claims rest on empirical results for diverse morphologies, not on re-deriving inputs by construction. This is a standard non-circular design for a learned model.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the latent representation and bipartite matching are presented as core technical choices without further decomposition.

pith-pipeline@v0.9.0 · 5521 in / 987 out tokens · 34746 ms · 2026-05-16T21:27:44.744095+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation
cs.GR 2026-04 unverdicted novelty 7.0

AniGen directly generates animatable 3D assets with consistent shape, skeleton, and skinning from single images using unified $S^3$ fields and a two-stage flow-matching pipeline.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1] Adobe. Mixamo, 2024. https://www.mixamo.com

[2] Ilya Baran and Jovan Popović. Automatic rigging and animation of 3D characters. ACM TOG, 26(3):72–es, 2007.

[3] Zedong Chu, Feng Xiong, Meiduo Liu, Jinzhi Zhang, Mingqi Shao, Zhaoxu Sun, Di Wang, and Mu Xu. HumanRig: Learning automatic rigging for humanoid character in a large scale dataset, 2024.

[4] Ken Deng, Yuan-Chen Guo, Jingxiang Sun, Zi-Xin Zou, Yangguang Li, Xin Cai, Yan-Pei Cao, Yebin Liu, and Ding Liang. DetailGen3D: Generative 3D geometry enhancement via data-dependent flow, 2025.

[5] Yufan Deng, Yuhao Zhang, Chen Geng, Shangzhe Wu, and Jiajun Wu. Anymate: A dataset and baselines for learning 3D object rigging. In SIGGRAPH Conference Proceedings, Vancouver, BC, Canada, 2025. Association for Computing Machinery.

[6] Zhiyang Guo, Jinxu Xiang, Kai Ma, Wengang Zhou, Houqiang Li, and Ran Zhang. Make-It-Animatable: An efficient framework for authoring animation-ready 3D characters. In CVPR, 2025.

[7] Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, and Hao Tan. LRM: Large reconstruction model for single image to 3D. In ICLR, 2024.

[8] Yukun Huang, Jianan Wang, Ailing Zeng, Zheng-Jun Zha, Lei Zhang, and Xihui Liu. DreamWaltz-G: Expressive 3D gaussian avatars from skeleton-guided 2D diffusion. IEEE TPAMI, 2025.

[9] Zehuan Huang, Haoran Feng, Yangtian Sun, Yuanchen Guo, Yanpei Cao, and Lu Sheng. AnimaX: Animating the inanimate in 3D with joint video-pose diffusion models. arXiv preprint arXiv:2506.19851, 2025.

[10] Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, and Zexiang Xu. LVSM: A large view synthesis model with minimal 3D inductive bias. In ICLR, 2025.
[11] Lin Li, Zehuan Huang, Haoran Feng, Gengxiong Zhuang, Rui Chen, Chunchao Guo, and Lu Sheng. VoxHammer: Training-free precise and coherent 3D editing in native 3D space. arXiv preprint arXiv:2508.19247, 2025.

[12] Xiao-Lei Li, Hao-Xiang Chen, Yanni Zhang, Kai Ma, Alan Zhao, Tai-Jiang Mu, Hao-Xiang Guo, and Ran Zhang. RELATE3D: Refocusing latent adapter for targeted local enhancement and editing in 3D generation. In Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers, pages 1–12, 2025.

[13] Yangguang Li, Zi-Xin Zou, Zexiang Liu, Dehu Wang, Yuan Liang, Zhipeng Yu, Xingchao Liu, Yuan-Chen Guo, Ding Liang, Wanli Ouyang, et al. TripoSG: High-fidelity 3D shape synthesis using large-scale rectified flow models. arXiv preprint arXiv:2502.06608, 2025.

[14] Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, and Michael J. Black. TADA! Text to animatable digital avatars. In 3DV, pages 1508–1519,

[15] Isabella Liu, Zhan Xu, Wang Yifan, Hao Tan, Zexiang Xu, Xiaolong Wang, Hao Su, and Zifan Shi. RigAnything: Template-free autoregressive rigging for diverse 3D assets. ACM TOG, 44(4):1–12, 2025.

[16] Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. Zero-1-to-3: Zero-shot one image to 3D object. In ICCV, pages 9298–9309, 2023.

[17] Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, et al. Wonder3D: Single image to 3D using cross-domain diffusion. In CVPR, pages 9970–9980, 2024.

[18] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black. SMPL: A skinned multi-person linear model. ACM TOG, 34(6):248:1–248:16, 2015.

[19] Jing Ma and Dongliang Zhang. TARig: Adaptive template-aware neural rigging for humanoid characters. Computers & Graphics, 114:158–167, 2023.

[20] Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. Expressive body capture: 3D hands, face, and body from a single image. In CVPR, pages 10975–10985, 2019.
[21] Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. DreamFusion: Text-to-3D using 2D diffusion. arXiv preprint arXiv:2209.14988, 2022.

[22] Cédric Portaneri, Mael Rouxel-Labbé, Michael Hemmer, David Cohen-Steiner, and Pierre Alliez. Alpha wrapping with an offset. ACM TOG, 41(4):1–22, 2022.

[23] Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, and Francis Williams. XCube: Large-scale 3D generative modeling using sparse voxel hierarchies. In CVPR, pages 4209–4219, 2024.

[24] Tianchang Shen, Jacob Munkberg, Jon Hasselgren, Kangxue Yin, Zian Wang, Wenzheng Chen, Zan Gojcic, Sanja Fidler, Nicholas Sharp, and Jun Gao. Flexible isosurface extraction for gradient-based mesh optimization. ACM TOG, 42(4):1–16, 2023.

[25] Chaoyue Song, Xiu Li, Fan Yang, Zhongcong Xu, Jiacheng Wei, Fayao Liu, Jiashi Feng, Guosheng Lin, and Jianfeng Zhang. Puppeteer: Rig and animate your 3D models. NeurIPS, 2025.

[26] Chaoyue Song, Jianfeng Zhang, Xiu Li, Fan Yang, Yiwen Chen, Zhongcong Xu, Jun Hao Liew, Xiaoyang Guo, Fayao Liu, Jiashi Feng, et al. MagicArticulate: Make your 3D models articulation-ready. In CVPR, pages 15998–16007, 2025.

[27] Mingze Sun, Junhao Chen, Junting Dong, Yurun Chen, Xinyu Jiang, Shiwei Mao, Puhua Jiang, Jingbo Wang, Bo Dai, and Ruqi Huang. DRiVE: Diffusion-based rigging empowers generation of versatile and expressive characters. In CVPR, pages 21170–21180, 2025.

[28] Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single-view 3D reconstruction. In CVPR, pages 10208–10217, 2024.

[29] Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, and Ziwei Liu. LGM: Large multi-view gaussian model for high-resolution 3D content creation. In ECCV, pages 1–18. Springer, 2024.

[30] Tencent Hunyuan3D Team. Hunyuan3D 2.1: From images to high-fidelity 3D assets with production-ready PBR material,
[31] Tencent Hunyuan3D Team. Hunyuan3D-Omni: A unified framework for controllable generation of 3D assets, 2025.

[32] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. VGGT: Visual geometry grounded transformer. In CVPR, pages 5294–5306, 2025.

[33] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D vision made easy. In CVPR, pages 20697–20709, 2024.

[34] Zilong Wang, Zhiyang Dou, Yuan Liu, Cheng Lin, Xiao Dong, Yunhui Guo, Chenxu Zhang, Xin Li, Wenping Wang, and Xiaohu Guo. WonderHuman: Hallucinating unseen parts in dynamic 3D human reconstruction. arXiv preprint arXiv:2502.01045, 2025.

[35] Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, and Zexiang Xu. MeshLRM: Large reconstruction model for high-quality mesh. arXiv preprint arXiv:2404.12385, 2024.

[36] Zijie Wu, Chaohui Yu, Fan Wang, and Xiang Bai. AnimateAnyMesh: A feed-forward 4D foundation model for text-driven universal mesh animation. In ICCV, 2025.

[37] Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, and Jiaolong Yang. Structured 3D latents for scalable and versatile 3D generation. In CVPR, pages 21469–21480, 2025.

[38] Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, and Ying Shan. InstantMesh: Efficient 3D mesh generation from a single image with sparse-view large reconstruction models. arXiv preprint arXiv:2404.07191,

[39] Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, and Gordon Wetzstein. GRM: Large gaussian reconstruction model for efficient 3D reconstruction and generation. In ECCV, pages 1–20. Springer, 2024.

[40] Zhan Xu, Yang Zhou, Evangelos Kalogerakis, Chris Landreth, and Karan Singh. RigNet: Neural rigging for articulated characters. ACM TOG, 39(4):58:58:1–58:58:14, 2020.
[41] Hongyu Yan, Kunming Luo, Weiyu Li, Yixun Liang, Shengming Li, Jingwei Huang, Chunchao Guo, and Ping Tan. PoseMaster: Generating 3D characters in arbitrary poses from a single image. arXiv preprint arXiv:2506.21076, 2025.

[42] Xinhao Yan, Jiachen Xu, Yang Li, Changfeng Ma, Yunhan Yang, Chunshi Wang, Zibo Zhao, Zeqiang Lai, Yunfei Zhao, Zhuo Chen, et al. X-Part: high fidelity and structure coherent shape decomposition. arXiv preprint arXiv:2509.08643,

[43] Yunhan Yang, Yuan-Chen Guo, Yukun Huang, Zi-Xin Zou, Zhipeng Yu, Yangguang Li, Yan-Pei Cao, and Xihui Liu. HoloPart: Generative 3D part amodal segmentation. arXiv preprint arXiv:2504.07943, 2025.

[44] Chongjie Ye, Yushuang Wu, Ziteng Lu, Jiahao Chang, Xiaoyang Guo, Jiaqing Zhou, Hao Zhao, and Xiaoguang Han. Hi3DGen: High-fidelity 3D geometry generation from images via normal bridging. arXiv preprint arXiv:2503.22236, 3:2, 2025.

[45] Zhiyuan Yu, Zhe Li, Hujun Bao, Can Yang, and Xiaowei Zhou. HumanRAM: Feed-forward human reconstruction and animation model using transformers. In SIGGRAPH Conference Proceedings, 2025.

[46] Biao Zhang, Jiapeng Tang, Matthias Nießner, and Peter Wonka. 3DShape2VecSet: A 3D shape representation for neural fields and generative diffusion models. ACM TOG, 42(4):92:1–92:16, 2023.

[47] Jiahui Zhang, Yuelei Li, Anpei Chen, Muyu Xu, Kunhao Liu, Jianyuan Wang, Xiao-Xiao Long, Hanxue Liang, Zexiang Xu, Hao Su, et al. Advances in feed-forward 3D reconstruction and view synthesis: A survey. arXiv preprint arXiv:2507.14501, 2025.

[48] Jia-Peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, and Shi-Min Hu. One model to rig them all: Diverse skeleton rigging with UniRig. ACM TOG, 44(4):1–18, 2025.

[49] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. GS-LRM: Large reconstruction model for 3D gaussian splatting. In ECCV, pages 1–19. Springer, 2024.

[50] Longwen Zhang, Ziyu Wang, Qixuan Zhang, Qiwei Qiu, Anqi Pang, Haoran Jiang, Wei Yang, Lan Xu, and Jingyi Yu. CLAY: A controllable large-scale generative model for creating high-quality 3D assets. ACM TOG, 43(4):1–20, 2024.

[51] Longwen Zhang, Qixuan Zhang, Haoran Jiang, Yinuo Bai, Wei Yang, Lan Xu, and Jingyi Yu. BANG: Dividing 3D assets via generative exploded dynamics. ACM TOG, 44(4):1–21, 2025.

[1] [1]

Mixamo, 2024.https://www.mixamo.com

Adobe. Mixamo, 2024.https://www.mixamo.com. 6

work page 2024

[2] [2]

Automatic rigging and ani- mation of 3D characters.ACM TOG, 26(3):72–es, 2007

Ilya Baran and Jovan Popovi ´c. Automatic rigging and ani- mation of 3D characters.ACM TOG, 26(3):72–es, 2007. 3

work page 2007

[3] [3]

Human- Rig: Learning automatic rigging for humanoid character in a large scale dataset, 2024

Zedong Chu, Feng Xiong, Meiduo Liu, Jinzhi Zhang, Mingqi Shao, Zhaoxu Sun, Di Wang, and Mu Xu. Human- Rig: Learning automatic rigging for humanoid character in a large scale dataset, 2024. 1, 3, 6

work page 2024

[4] [4]

DetailGen3D: Generative 3D geometry enhancement via data-dependent flow, 2025

Ken Deng, Yuan-Chen Guo, Jingxiang Sun, Zi-Xin Zou, Yangguang Li, Xin Cai, Yan-Pei Cao, Yebin Liu, and Ding Liang. DetailGen3D: Generative 3D geometry enhancement via data-dependent flow, 2025. 2, 4

work page 2025

[5] [5]

Anymate: A dataset and baselines for learning 3D object rigging

Yufan Deng, Yuhao Zhang, Chen Geng, Shangzhe Wu, and Jiajun Wu. Anymate: A dataset and baselines for learning 3D object rigging. InSIGGRAPH Conference Proceedings, Vancouver, BC, Canada, 2025. Association for Computing Machinery. 3

work page 2025

[6] Zhiyang Guo, Jinxu Xiang, Kai Ma, Wengang Zhou, Houqiang Li, and Ran Zhang. Make-It-Animatable: An efficient framework for authoring animation-ready 3D characters. In CVPR, 2025.

[7] Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, and Hao Tan. LRM: Large reconstruction model for single image to 3D. In ICLR, 2024.

[8] Yukun Huang, Jianan Wang, Ailing Zeng, Zheng-Jun Zha, Lei Zhang, and Xihui Liu. DreamWaltz-G: Expressive 3D gaussian avatars from skeleton-guided 2D diffusion. IEEE TPAMI, 2025.

[9] Zehuan Huang, Haoran Feng, Yangtian Sun, Yuanchen Guo, Yanpei Cao, and Lu Sheng. AnimaX: Animating the inanimate in 3D with joint video-pose diffusion models. arXiv preprint arXiv:2506.19851, 2025.

[10] Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, and Zexiang Xu. LVSM: A large view synthesis model with minimal 3D inductive bias. In ICLR, 2025.

[11] Lin Li, Zehuan Huang, Haoran Feng, Gengxiong Zhuang, Rui Chen, Chunchao Guo, and Lu Sheng. Voxhammer: Training-free precise and coherent 3D editing in native 3D space. arXiv preprint arXiv:2508.19247, 2025.

[12] Xiao-Lei Li, Hao-Xiang Chen, Yanni Zhang, Kai Ma, Alan Zhao, Tai-Jiang Mu, Hao-Xiang Guo, and Ran Zhang. RELATE3D: Refocusing latent adapter for targeted local enhancement and editing in 3D generation. In Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers, pages 1–12, 2025.

[13] Yangguang Li, Zi-Xin Zou, Zexiang Liu, Dehu Wang, Yuan Liang, Zhipeng Yu, Xingchao Liu, Yuan-Chen Guo, Ding Liang, Wanli Ouyang, et al. TripoSG: High-fidelity 3D shape synthesis using large-scale rectified flow models. arXiv preprint arXiv:2502.06608, 2025.

[14] Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, and Michael J. Black. TADA! Text to animatable digital avatars. In 3DV, pages 1508–1519,

[15] Isabella Liu, Zhan Xu, Wang Yifan, Hao Tan, Zexiang Xu, Xiaolong Wang, Hao Su, and Zifan Shi. RigAnything: Template-free autoregressive rigging for diverse 3D assets. ACM TOG, 44(4):1–12, 2025.

[16] Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. Zero-1-to-3: Zero-shot one image to 3D object. In ICCV, pages 9298–9309, 2023.

[17] Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, et al. Wonder3D: Single image to 3D using cross-domain diffusion. In CVPR, pages 9970–9980, 2024.

[18] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black. SMPL: A skinned multi-person linear model. ACM TOG, 34(6):248:1–248:16, 2015.

[19] Jing Ma and Dongliang Zhang. TARig: Adaptive template-aware neural rigging for humanoid characters. Computers & Graphics, 114:158–167, 2023.

[20] Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. Expressive body capture: 3D hands, face, and body from a single image. In CVPR, pages 10975–10985, 2019.

[21] Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. DreamFusion: Text-to-3D using 2D diffusion. arXiv preprint arXiv:2209.14988, 2022.

[22] Cédric Portaneri, Mael Rouxel-Labbé, Michael Hemmer, David Cohen-Steiner, and Pierre Alliez. Alpha wrapping with an offset. ACM TOG, 41(4):1–22, 2022.

[23] Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, and Francis Williams. XCube: Large-scale 3D generative modeling using sparse voxel hierarchies. In CVPR, pages 4209–4219, 2024.

[24] Tianchang Shen, Jacob Munkberg, Jon Hasselgren, Kangxue Yin, Zian Wang, Wenzheng Chen, Zan Gojcic, Sanja Fidler, Nicholas Sharp, and Jun Gao. Flexible isosurface extraction for gradient-based mesh optimization. ACM TOG, 42(4):1–16, 2023.

[25] Chaoyue Song, Xiu Li, Fan Yang, Zhongcong Xu, Jiacheng Wei, Fayao Liu, Jiashi Feng, Guosheng Lin, and Jianfeng Zhang. Puppeteer: Rig and animate your 3D models. In NeurIPS, 2025.

[26] Chaoyue Song, Jianfeng Zhang, Xiu Li, Fan Yang, Yiwen Chen, Zhongcong Xu, Jun Hao Liew, Xiaoyang Guo, Fayao Liu, Jiashi Feng, et al. MagicArticulate: Make your 3D models articulation-ready. In CVPR, pages 15998–16007, 2025.

[27] Mingze Sun, Junhao Chen, Junting Dong, Yurun Chen, Xinyu Jiang, Shiwei Mao, Puhua Jiang, Jingbo Wang, Bo Dai, and Ruqi Huang. DRiVE: Diffusion-based rigging empowers generation of versatile and expressive characters. In CVPR, pages 21170–21180, 2025.

[28] Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single-view 3D reconstruction. In CVPR, pages 10208–10217, 2024.

[29] Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, and Ziwei Liu. LGM: Large multi-view gaussian model for high-resolution 3D content creation. In ECCV, pages 1–18. Springer, 2024.

[30] Tencent Hunyuan3D Team. Hunyuan3D 2.1: From images to high-fidelity 3D assets with production-ready PBR material,

[31] Tencent Hunyuan3D Team. Hunyuan3D-Omni: A unified framework for controllable generation of 3D assets, 2025.

[32] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. VGGT: Visual geometry grounded transformer. In CVPR, pages 5294–5306, 2025.

[33] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D vision made easy. In CVPR, pages 20697–20709, 2024.

[34] Zilong Wang, Zhiyang Dou, Yuan Liu, Cheng Lin, Xiao Dong, Yunhui Guo, Chenxu Zhang, Xin Li, Wenping Wang, and Xiaohu Guo. WonderHuman: Hallucinating unseen parts in dynamic 3D human reconstruction. arXiv preprint arXiv:2502.01045, 2025.

[35] Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, and Zexiang Xu. MeshLRM: Large reconstruction model for high-quality mesh. arXiv preprint arXiv:2404.12385, 2024.

[36] Zijie Wu, Chaohui Yu, Fan Wang, and Xiang Bai. AnimateAnyMesh: A feed-forward 4D foundation model for text-driven universal mesh animation. In ICCV, 2025.

[37] Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, and Jiaolong Yang. Structured 3D latents for scalable and versatile 3D generation. In CVPR, pages 21469–21480, 2025.

[38] Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, and Ying Shan. InstantMesh: Efficient 3D mesh generation from a single image with sparse-view large reconstruction models. arXiv preprint arXiv:2404.07191,

[39] Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, and Gordon Wetzstein. GRM: Large gaussian reconstruction model for efficient 3D reconstruction and generation. In ECCV, pages 1–20. Springer, 2024.

[40] Zhan Xu, Yang Zhou, Evangelos Kalogerakis, Chris Landreth, and Karan Singh. RigNet: Neural rigging for articulated characters. ACM TOG, 39(4):58:58:1–58:58:14, 2020.

[41] Hongyu Yan, Kunming Luo, Weiyu Li, Yixun Liang, Shengming Li, Jingwei Huang, Chunchao Guo, and Ping Tan. PoseMaster: Generating 3D characters in arbitrary poses from a single image. arXiv preprint arXiv:2506.21076, 2025.

[42] Xinhao Yan, Jiachen Xu, Yang Li, Changfeng Ma, Yunhan Yang, Chunshi Wang, Zibo Zhao, Zeqiang Lai, Yunfei Zhao, Zhuo Chen, et al. X-Part: high fidelity and structure coherent shape decomposition. arXiv preprint arXiv:2509.08643,

[43] Yunhan Yang, Yuan-Chen Guo, Yukun Huang, Zi-Xin Zou, Zhipeng Yu, Yangguang Li, Yan-Pei Cao, and Xihui Liu. HoloPart: Generative 3D part amodal segmentation. arXiv preprint arXiv:2504.07943, 2025.

[44] Chongjie Ye, Yushuang Wu, Ziteng Lu, Jiahao Chang, Xiaoyang Guo, Jiaqing Zhou, Hao Zhao, and Xiaoguang Han. Hi3DGen: High-fidelity 3D geometry generation from images via normal bridging. arXiv preprint arXiv:2503.22236, 3:2, 2025.

[45] Zhiyuan Yu, Zhe Li, Hujun Bao, Can Yang, and Xiaowei Zhou. HumanRAM: Feed-forward human reconstruction and animation model using transformers. In SIGGRAPH Conference Proceedings, 2025.

[46] Biao Zhang, Jiapeng Tang, Matthias Nießner, and Peter Wonka. 3DShape2VecSet: A 3D shape representation for neural fields and generative diffusion models. ACM TOG, 42(4):92:1–92:16, 2023.

[47] Jiahui Zhang, Yuelei Li, Anpei Chen, Muyu Xu, Kunhao Liu, Jianyuan Wang, Xiao-Xiao Long, Hanxue Liang, Zexiang Xu, Hao Su, et al. Advances in feed-forward 3D reconstruction and view synthesis: A survey. arXiv preprint arXiv:2507.14501, 2025.

[48] Jia-Peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, and Shi-Min Hu. One model to rig them all: Diverse skeleton rigging with UniRig. ACM TOG, 44(4):1–18, 2025.

[49] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. GS-LRM: Large reconstruction model for 3D gaussian splatting. In ECCV, pages 1–19. Springer, 2024.

[50] Longwen Zhang, Ziyu Wang, Qixuan Zhang, Qiwei Qiu, Anqi Pang, Haoran Jiang, Wei Yang, Lan Xu, and Jingyi Yu. CLAY: A controllable large-scale generative model for creating high-quality 3D assets. ACM TOG, 43(4):1–20, 2024.

[51] Longwen Zhang, Qixuan Zhang, Haoran Jiang, Yinuo Bai, Wei Yang, Lan Xu, and Jingyi Yu. BANG: Dividing 3D assets via generative exploded dynamics. ACM TOG, 44(4):1–21, 2025.

Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Supplementary Material

A. Implementation Details

A.1. Model Details

A.1.1. Shape VAE

Our 3D...