pith. sign in

arxiv: 2604.06467 · v1 · submitted 2026-04-07 · 💻 cs.CV

PhysHead: Simulation-Ready Gaussian Head Avatars

Pith reviewed 2026-05-10 18:43 UTC · model grok-4.3

classification 💻 cs.CV
keywords head avatarsGaussian splattinghair dynamicsphysics simulationanimatable avatarsmulti-view videostrand-based hairvolumetric representation
0
0 comments X

The pith

PhysHead creates head avatars with realistic physics-simulated hair motion from multi-view video using a layered Gaussian representation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops PhysHead to fix the common problem where head avatars treat hair as a rigid outer shell that does not move naturally. It builds a hybrid model that places 3D Gaussian primitives on both a parametric head mesh and separate hair strands, so the strands can be fed directly into physics engines for animation. This produces effects such as wind-blown hair while the face expresses and the viewpoint changes. Vision-language models fill in the appearance of hair regions hidden during capture. Readers would care because current digital humans lose realism the moment hair moves, limiting applications in games, VR, and film.

Core claim

We introduce PhysHead, a hybrid representation for animatable head avatars with realistic hair dynamics learned from multi-view video. At the core is a 3D Gaussian-based layered representation of the head that combines a 3D parametric mesh for the head with strand-based hair, which can be directly simulated using physics engines. Gaussian primitives are attached to both the head mesh and hair segments for photorealistic appearance. This enables creation of photorealistic head avatars with dynamic hair behavior such as wind-blown motion. We propose the use of VLM-based models to generate appearance of regions that are occluded in the dynamic training sequences.

What carries the argument

hybrid 3D Gaussian-based layered representation that attaches Gaussian primitives to a parametric head mesh and strand-based hair for both appearance modeling and direct physics simulation

If this is right

  • Physically plausible hair motion can be synthesized in addition to expression and camera control.
  • Dynamic behaviors such as wind-blown hair become possible without assuming rigid movement.
  • Hair strands support direct simulation inside standard physics engines.
  • Photorealistic appearance is maintained across head and hair through attached Gaussian primitives.
  • Quantitative and qualitative evaluations show advantages over rigid-hair baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same layered approach could be tested on other moving elements such as clothing or loose accessories for consistent physics.
  • Occlusion handling via vision-language models may apply to reconstruction of other dynamic scenes with partial visibility.
  • Integration with full-body avatars would allow hair to interact physically with body motion and environment objects.

Load-bearing premise

Vision-language models can reliably synthesize accurate appearance for occluded hair regions without introducing artifacts that break photorealism under physics simulation.

What would settle it

Animate the hair under novel external forces such as strong wind or rapid head turns not present in the training videos and check whether previously occluded strands render without visible artifacts or loss of realism.

Figures

Figures reproduced from arXiv: 2604.06467 by Berna Kabadayi, Gerard Pons-Moll, Justus Thies, Vanessa Sklyarova, Wojciech Zielonka.

Figure 1
Figure 1. Figure 1: We introduce PhysHead, a novel method for creating photorealistic head avatars with dynamic hair from multi-view videos. We [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview. PhysHead reconstructs an animatable 3D human head avatar (e) from a multiview input video. It is based on a 3D Gaussian appearance representation that is split into a face (c) and a hair region (d). The face region uses 3D Gaussians that are attached to a 3DMM-based mesh (FLAME [43]), which allows for parametric facial expression as well as head pose control (a,c). To enable physics￾based animati… view at source ↗
Figure 4
Figure 4. Figure 4: Using strand-skinning [10], the dense strands are af￾fected by k = 10 sparse strands. The skinning weights are in￾versely proportional to the distance from the sparse hair strands. Specifically, if d1 is smaller than d2 in the figure, then ω1 is larger than ω2. An iterative constraint solver is utilized to resolve collisions and to enforce joint or contact constraints, ensuring a stable and realistic simul… view at source ↗
Figure 5
Figure 5. Figure 5: Qualitative Comparison All methods synthesize pho￾torealistic avatars. However, both GHA [97] and GA [63] have rigid hair when animated, as Gaussians are unstructured. In con￾trast, our method, PhysHead, benefitting from strand representa￾tion, generalizes under novel driving signals. Gaussian Head Avatars (GHA) [97] is a Gaussian-based head avatar approach comprising two stages. First, a mesh head is opti… view at source ↗
Figure 6
Figure 6. Figure 6: GaussianHaircut [100] comparison. As can be seen, GaussianHaircut is not yet suitable for simulation with learned appearance, as some hair strands penetrate the head and are assigned skin colors. In contrast, our reconstructed hair strands exhibit more consistent coloring due to the proposed consistency loss and can be used directly in simulation. 4.1. Comparison Hair reconstruction evaluation. We evaluate… view at source ↗
Figure 9
Figure 9. Figure 9: The strand-based hair layer can be edited by the user to [PITH_FULL_IMAGE:figures/full_fig_p008_9.png] view at source ↗
Figure 7
Figure 7. Figure 7: HairCUP [32] comparison. Since haircup uses SDS, it exhibits artifacts and has tendency to generate the same type of skin color. In contrast, we use VLM model for image generation which is trained on a lot of data. This helps with generalization to different skin colors. Despite being compositional, it has un￾structured gaussians which does not allow for physics simulation. For more results, see supp mat. … view at source ↗
Figure 8
Figure 8. Figure 8: The consistency loss helps to assign meaningful colors [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
Figure 11
Figure 11. Figure 11: Analysis of the riggidity of the hair geometry on a [PITH_FULL_IMAGE:figures/full_fig_p014_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Extended HairCUP [32] comparison. HairCUP im￾ages are sourced from the original publication. As shown, Hair￾CUP may produce artifacts and exhibits limited skin-tone variabil￾ity across subjects, while our method mitigates these issues. tion loss significantly improves the realism of hair, partic￾ularly its internal structure. Finally, we present additional comparisons of dynamic hair appearance with Gauss… view at source ↗
Figure 15
Figure 15. Figure 15: Single layer animation. Single layer of optimiza￾tion results in artifacts during animation despite the usage of hair strands. 10.4. Limitations PhysHead operates on reconstructed strand geometry. In our experiments, we utilize NH [72], which fails to recon￾struct curly hair, shown in [PITH_FULL_IMAGE:figures/full_fig_p017_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Curly Hair Curly Hair results with GT and Ours [PITH_FULL_IMAGE:figures/full_fig_p017_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Extended Gaussian Haircut (GH) [100] comparison. Gaussian Haircut captures the overall outer hair color nicely but exhibits artifacts in invisible regions. Our method propagates the outer color into inner strands, enabling consistent hair appearance [PITH_FULL_IMAGE:figures/full_fig_p018_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Additional VLM-Based Bald Image Generation Results [PITH_FULL_IMAGE:figures/full_fig_p019_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: Additional VLM-Based Bald Image Generation Results [PITH_FULL_IMAGE:figures/full_fig_p020_19.png] view at source ↗
Figure 20
Figure 20. Figure 20: Additional VLM-Based Bald Image Generation Results [PITH_FULL_IMAGE:figures/full_fig_p021_20.png] view at source ↗
read the original abstract

Realistic digital avatars require expressive and dynamic hair motion; however, most existing head avatar methods assume rigid hair movement. These methods often fail to disentangle hair from the head, representing it as a simple outer shell and failing to capture its natural volumetric behavior. In this paper, we address these limitations by introducing PhysHead, a hybrid representation for animatable head avatars with realistic hair dynamics learned from multi-view video. At the core is a 3D Gaussian-based layered representation of the head. Our approach combines a 3D parametric mesh for the head with strand-based hair, which can be directly simulated using physics engines. For the appearance model, we employ Gaussian primitives attached to both the head mesh and hair segments. This representation enables the creation of photorealistic head avatars with dynamic hair behavior, such as wind-blown motion, overcoming the constraints of rigid hair in existing methods. However, these animation capabilities also require new training schemes. In particular, we propose the use of VLM-based models to generate appearance of regions that are occluded in the dynamic training sequences. In quantitative and qualitative studies, we demonstrate the capabilities of the proposed model and compare it with existing baselines. We show that our method can synthesize physically plausible hair motion besides expression and camera control.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces PhysHead, a hybrid 3D Gaussian representation for animatable head avatars that integrates a parametric head mesh with strand-based hair. Gaussian primitives are attached to the mesh and hair strands, enabling direct simulation of hair dynamics using physics engines. The approach uses vision-language models (VLMs) to synthesize appearance for regions occluded during dynamic training sequences. It claims to achieve photorealistic results with expression control, camera control, and physically plausible hair motion (e.g., wind-blown), demonstrated through quantitative and qualitative studies comparing to baselines.

Significance. If the claims hold, this work represents a meaningful advance in digital avatar technology by moving beyond rigid hair assumptions to enable dynamic, physics-based hair behavior while maintaining photorealism. The simulation-ready nature of the representation is a notable strength, as it allows integration with standard physics engines without additional post-processing. This could have broad impact in VR/AR, animation, and telepresence applications where realistic hair dynamics are essential.

major comments (2)
  1. Abstract: The central claim that the method synthesizes 'physically plausible hair motion' relies on VLM-generated appearance for occluded regions remaining consistent under novel poses and viewpoints generated by the physics engine. No quantitative validation (e.g., perceptual metrics, artifact counts, or before/after simulation comparisons) is referenced to confirm this, which is load-bearing because mismatches would produce visible artifacts precisely when strands move and expose new surfaces.
  2. Method section (hybrid representation): The attachment of Gaussian primitives to hair segments and the mechanism by which their positions, covariances, and colors update during physics simulation are described at a high level but lack explicit equations or deformation rules. This makes it impossible to verify whether the appearance model is truly simulation-ready or requires implicit refitting, directly affecting the 'simulation-ready' and photorealism claims.
minor comments (2)
  1. Abstract: The phrasing 'besides expression and camera control' is slightly unclear; rephrasing to 'in addition to expression and camera control' would improve readability.
  2. Abstract: While quantitative and qualitative studies are mentioned, the abstract itself contains no specific metrics, dataset details, or baseline names, which reduces the immediate impact of the claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating where revisions will be made to strengthen the presentation and claims.

read point-by-point responses
  1. Referee: Abstract: The central claim that the method synthesizes 'physically plausible hair motion' relies on VLM-generated appearance for occluded regions remaining consistent under novel poses and viewpoints generated by the physics engine. No quantitative validation (e.g., perceptual metrics, artifact counts, or before/after simulation comparisons) is referenced to confirm this, which is load-bearing because mismatches would produce visible artifacts precisely when strands move and expose new surfaces.

    Authors: We agree that the consistency of VLM-generated appearances under novel poses and viewpoints induced by physics simulation is critical to the central claim. The current manuscript supports photorealism and plausibility primarily through qualitative comparisons and visual demonstrations of wind-blown and expression-driven motion. To directly address this concern, we will add quantitative validation in the revised version, including perceptual metrics (e.g., LPIPS, FID) on rendered frames from simulated sequences and before/after comparisons to quantify consistency and artifact rates. revision: yes

  2. Referee: Method section (hybrid representation): The attachment of Gaussian primitives to hair segments and the mechanism by which their positions, covariances, and colors update during physics simulation are described at a high level but lack explicit equations or deformation rules. This makes it impossible to verify whether the appearance model is truly simulation-ready or requires implicit refitting, directly affecting the 'simulation-ready' and photorealism claims.

    Authors: The hybrid representation attaches Gaussian primitives rigidly to the underlying head mesh vertices and hair strand segments. During simulation, primitive positions are updated by transforming them according to the new strand positions and orientations produced by the physics engine, while covariances and colors remain fixed (as they are optimized per-primitive during training and do not require refitting). We acknowledge that the current description is high-level. In the revision we will add explicit equations for the attachment mapping and the deformation update rules to make the simulation-ready property fully verifiable. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external established components

full rationale

The paper's core contribution is a hybrid representation (parametric mesh + strand-based hair + attached 3D Gaussians) trained from multi-view video, with VLM-based synthesis for occluded regions and direct use of physics engines for simulation. No equations, predictions, or central claims reduce by construction to fitted parameters defined inside the paper or to self-citations whose validity depends on the present work. All load-bearing elements (Gaussian splatting, strand simulation, VLM inpainting) are imported from prior independent literature. The method is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

Based solely on the abstract, the central claim rests on standard domain assumptions of multi-view consistency and the effectiveness of attaching Gaussians to simulatable strands; no free parameters or invented entities are quantified in the provided text.

axioms (2)
  • domain assumption Multi-view video sequences contain enough information to reconstruct both geometry and appearance for head and hair
    Invoked implicitly when training the layered representation from video.
  • domain assumption Physics engines can produce plausible hair motion once strands are attached to the head mesh
    Required for the simulation-ready claim.
invented entities (1)
  • Layered 3D Gaussian head representation with attached hair strands no independent evidence
    purpose: To disentangle hair for independent physics simulation while preserving photorealistic appearance
    Newly proposed hybrid model described in the abstract.

pith-pipeline@v0.9.0 · 5535 in / 1432 out tokens · 38139 ms · 2026-05-10T18:43:07.465383+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

115 extracted references · 115 canonical work pages · 2 internal anchors

  1. [1]

    Ogras, and Linjie Luo

    Sizhe An, Hongyi Xu, Yichun Shi, Guoxian Song, Umit Y . Ogras, and Linjie Luo. PanoHead: Geometry-aware 3D full-head synthesis in 360°.CVPR, pages 20950–20959,

  2. [2]

    Scaffoldavatar: High-fidelity gaussian avatars with patch expressions

    Shivangi Aneja, Sebastian Weiss, Irene Baeza, Prashanth Chandran, Gaspard Zoss, Matthias Niessner, and Derek Bradley. Scaffoldavatar: High-fidelity gaussian avatars with patch expressions. InProceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers, pages 1–11, 2025. 2

  3. [3]

    RigNeRF: Fully control- lable neural 3d portraits

    ShahRukh Athar, Zexiang Xu, Kalyan Sunkavalli, Eli Shechtman, and Zhixin Shu. RigNeRF: Fully control- lable neural 3d portraits. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20364–20373, 2022. 2

  4. [4]

    FLAME-in-NeRF: Neural control of radiance fields for free view face animation

    ShahRukh Athar, Zhixin Shu, and Dimitris Samaras. FLAME-in-NeRF: Neural control of radiance fields for free view face animation. InIEEE International Conference on Automatic Face and Gesture Recognition (FG), 2023. 2

  5. [5]

    Autodesk, INC. Maya. 2, 5

  6. [6]

    A morphable model for the synthesis of 3D faces

    V olker Blanz and Thomas Vetter. A morphable model for the synthesis of 3D faces. InSIGGRAPH, pages 187–194,

  7. [7]

    Pega- sus: Personalized generative 3d avatars with composable attributes

    Hyunsoo Cha, Byungjun Kim, and Hanbyul Joo. Pega- sus: Personalized generative 3d avatars with composable attributes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1072–1081, 2024. 3

  8. [8]

    Perse: Per- sonalized 3d generative avatars from a single portrait

    Hyunsoo Cha, Inhee Lee, and Hanbyul Joo. Perse: Per- sonalized 3d generative avatars from a single portrait. In Proceedings of the Computer Vision and Pattern Recogni- tion Conference (CVPR), pages 15953–15962, 2025. 3

  9. [9]

    Adap- tive skinning for interactive hair-solid simulation.IEEE transactions on visualization and computer graphics, 23 (7):1725–1738, 2016

    Menglei Chai, Changxi Zheng, and Kun Zhou. Adap- tive skinning for interactive hair-solid simulation.IEEE transactions on visualization and computer graphics, 23 (7):1725–1738, 2016. 5

  10. [10]

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261, 2025. 2, 4

  11. [11]

    Bullet physics simulation

    Erwin Coumans. Bullet physics simulation. InACM SIG- GRAPH 2015 Courses, page 1, 2015. 2, 5

  12. [12]

    Interactive hair simulation on the gpu using admm.ACM SIGGRAPH 2023 Conference Proceedings,

    Gilles Daviet. Interactive hair simulation on the gpu using admm.ACM SIGGRAPH 2023 Conference Proceedings,

  13. [13]

    Bernhard Egger, William A. P. Smith, Ayush Tewari, Ste- fanie Wuhrer, Michael Zollhoefer, Thabo Beeler, Florian Bernard, Timo Bolkart, Adam Kortylewski, Sami Romd- hani, Christian Theobalt, V olker Blanz, and Thomas Vetter. 3D morphable face models—past, present, and future.ACM TOG, 39(5), 2020. 2

  14. [14]

    Unreal engine

    Epic Games. Unreal engine. 2, 3

  15. [15]

    Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps,

    Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, De- jia Xu, and Zhangyang Wang. Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps,

  16. [16]

    Yao Feng, Weiyang Liu, Timo Bolkart, Jinlong Yang, Marc Pollefeys, and Michael J. Black. Learning disentangled avatars with hybrid 3d representations.arXiv, 2023. 1, 3

  17. [17]

    Gaussian splashing: Dynamic fluid synthesis with gaussian splatting, 2024

    Yutao Feng, Xiang Feng, Yintong Shang, Ying Jiang, Chang Yu, Zeshun Zong, Tianjia Shao, Hongzhi Wu, Kun Zhou, Chenfanfu Jiang, and Yin Yang. Gaussian splashing: Dynamic fluid synthesis with gaussian splatting, 2024. 2

  18. [18]

    Dynamic neural radiance fields for monocular 4D facial avatar reconstruction.CVPR, pages 8645–8654,

    Guy Gafni, Justus Thies, Michael Zollhofer, and Matthias Nießner. Dynamic neural radiance fields for monocular 4D facial avatar reconstruction.CVPR, pages 8645–8654,

  19. [19]

    Srini- vasan, Jonathan T

    Ruiqi Gao*, Aleksander Holynski*, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul P. Srini- vasan, Jonathan T. Barron, and Ben Poole*. Cat3d: Create anything in 3d with multi-view diffusion models.Advances in Neural Information Processing Systems, 2024. 3

  20. [20]

    Reconstructing per- sonalized semantic facial nerf models from monocular video.ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 41(6), 2022

    Xuan Gao, Chenglai Zhong, Jun Xiang, Yang Hong, Yudong Guo, and Juyong Zhang. Reconstructing per- sonalized semantic facial nerf models from monocular video.ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), 41(6), 2022. 2

  21. [21]

    Garbin, Marek Kowalski, Virginia Estellers, Stanislaw Szymanowicz, Shideh Rezaeifar, Jingjing Shen, Matthew Johnson, and Julien Valentin

    Stephan J. Garbin, Marek Kowalski, Virginia Estellers, Stanislaw Szymanowicz, Shideh Rezaeifar, Jingjing Shen, Matthew Johnson, and Julien Valentin. V olTeMorph: Real- time, controllable and generalizable animation of volumet- ric representations.Computer Graphics Forum, 2024. To appear. 2

  22. [22]

    Npga: Neural paramet- ric gaussian avatars

    Simon Giebenhain, Tobias Kirschstein, Martin R ¨unz, Lour- des Agapito, and Matthias Nießner. Npga: Neural paramet- ric gaussian avatars. InSIGGRAPH Asia 2024 Conference Papers, page 1–11. ACM, 2024. 1, 2

  23. [23]

    Goral, Kenneth E

    Cindy M. Goral, Kenneth E. Torrance, Donald P. Green- berg, and Bennett Battaile. Modeling the interaction of light between diffuse surfaces.SIGGRAPH, 1984. 3

  24. [24]

    Digital salon: An ai and physics-driven tool for 3d hair grooming and simulation.ACM SIGGRAPH Asia 2024 Real-Time Live!, 2024

    Chengan He, Jorge Alejandro Amador Herrera, Yi Zhou, Zhixin Shu, Xin Sun, Yao Feng, S ¨oren Pirk, Dominik L Michels, Meng Zhang, Tuanfeng Y Wang, and Holly Rush- meier. Digital salon: An ai and physics-driven tool for 3d hair grooming and simulation.ACM SIGGRAPH Asia 2024 Real-Time Live!, 2024. 3

  25. [25]

    3dgh: 3d head generation with composable hair and face.ACM Transactions on Graphics, 44(4):1–12, 2025

    Chengan He, Junxuan Li, Tobias Kirschstein, Artem Sev- astopolsky, Shunsuke Saito, Qingyang Tan, Javier Romero, Chen Cao, Holly Rushmeier, and Giljoo Nam. 3dgh: 3d head generation with composable hair and face.ACM Transactions on Graphics, 44(4):1–12, 2025. 3

  26. [26]

    Michels, Tu- anfeng Y

    Chengan He, Xin Sun, Zhixin Shu, Fujun Luan, S ¨oren Pirk, Jorge Alejandro Amador Herrera, Dominik L. Michels, Tu- anfeng Y . Wang, Meng Zhang, Holly Rushmeier, and Yi Zhou. Perm: A parametric representation for multi-style 3d hair modeling. InInternational Conference on Learning Representations, 2025. 3

  27. [27]

    Gan-avatar: Control- lable personalized gan-based human head avatar

    Berna Kabadayi, Wojciech Zielonka, Bharat Lal Bhatnagar, Gerard Pons-Moll, and Justus Thies. Gan-avatar: Control- lable personalized gan-based human head avatar. In2024 International Conference on 3D Vision (3DV), pages 882–

  28. [28]

    CoNeRF: Control- lable Neural Radiance Fields

    Kacper Kania, Kwang Moo Yi, Marek Kowalski, Tomasz Trzci´nski, and Andrea Tagliasacchi. CoNeRF: Control- lable Neural Radiance Fields. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 20314–20323, 2022

  29. [29]

    Garbin, Andrea Tagliasacchi, Vir- ginia Estellers, Kwang Moo Yi, Julien Valentin, Tomasz Trzci´nski, and Marek Kowalski

    Kacper Kania, Stephan J. Garbin, Andrea Tagliasacchi, Vir- ginia Estellers, Kwang Moo Yi, Julien Valentin, Tomasz Trzci´nski, and Marek Kowalski. BlendFields: Few-Shot Example-Driven Facial Modeling. InProceedings of the IEEE Conference on Computer Vision and Pattern Recog- nition, 2023. 2

  30. [30]

    3D gaussian splatting for real-time radiance field rendering.ACM TOG, 42:1 – 14, 2023

    Bernhard Kerbl, Georgios Kopanas, Thomas Leimkuehler, and George Drettakis. 3D gaussian splatting for real-time radiance field rendering.ACM TOG, 42:1 – 14, 2023. 2, 3, 4

  31. [31]

    Haircup: Hair compositional universal prior for 3d gaussian avatars.arXiv preprint arXiv:2507.19481,

    Byungjun Kim, Shunsuke Saito, Giljoo Nam, Tomas Si- mon, Jason Saragih, Hanbyul Joo, and Junxuan Li. Hair- cup: Hair compositional universal prior for 3d gaussian avatars.arXiv preprint arXiv:2507.19481, 2025. 1, 2, 3, 5, 6, 8

  32. [32]

    Adam: A Method for Stochastic Optimization

    Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization.CoRR, abs/1412.6980, 2014. 1

  33. [33]

    NeRSemble: Multi-view ra- diance field reconstruction of human heads.ACM TOG, 42: 1 – 14, 2023

    Tobias Kirschstein, Shenhan Qian, Simon Giebenhain, Tim Walter, and Matthias Nießner. NeRSemble: Multi-view ra- diance field reconstruction of human heads.ACM TOG, 42: 1 – 14, 2023. 6

  34. [34]

    Gghead: Fast and generalizable 3d gaussian heads, 2024

    Tobias Kirschstein, Simon Giebenhain, Jiapeng Tang, Markos Georgopoulos, and Matthias Nießner. Gghead: Fast and generalizable 3d gaussian heads, 2024. 2

  35. [35]

    Avat3r: Large ani- matable gaussian reconstruction model for high-fidelity 3d head avatars, 2025

    Tobias Kirschstein, Javier Romero, Artem Sevastopolsky, Matthias Nießner, and Shunsuke Saito. Avat3r: Large ani- matable gaussian reconstruction model for high-fidelity 3d head avatars, 2025. 2, 3

  36. [36]

    Point-based neural rendering with per- view optimization.Comput

    Georgios Kopanas, Julien Philip, Thomas Leimk ¨uhler, and George Drettakis. Point-based neural rendering with per- view optimization.Comput. Graph. Forum, 40, 2021. 2

  37. [37]

    DeepMVSHair: Deep hair modeling from sparse views

    Zhiyi Kuang, Yiyang Chen, Hongbo Fu, Kun Zhou, and Youyi Zheng. DeepMVSHair: Deep hair modeling from sparse views. InSIGGRAPH Asia 2022 Conference Pa- pers, New York, NY , USA, 2022. Association for Comput- ing Machinery. 3

  38. [38]

    Modular primitives for high-performance differentiable rendering.ACM Transac- tions on Graphics, 39(6), 2020

    Samuli Laine, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, and Timo Aila. Modular primitives for high-performance differentiable rendering.ACM Transac- tions on Graphics, 39(6), 2020. 5

  39. [39]

    Loki: a unified multiphysics simulation framework for production.ACM Trans

    Steve Lesser, Alexey Stomakhin, Gilles Daviet, Joel Wret- born, John Edholm, Noh-Hoon Lee, Eston Schweickart, Xiao Zhai, Sean Flynn, and Andrew Moffat. Loki: a unified multiphysics simulation framework for production.ACM Trans. Graph., 41(4), 2022. 2

  40. [40]

    ShellNeRF: Learning a Controllable High-resolution Model of the Eye and Periocular Region

    Gengyan Li, Kripasindhu Sarkar, Abhimitra Meka, Marcel Buehler, Franziska Mueller, Paulo Gotardo, Otmar Hilliges, and Thabo Beeler. ShellNeRF: Learning a Controllable High-resolution Model of the Eye and Periocular Region. Computer Graphics Forum, 2024. 2

  41. [41]

    Uravatar: Universal relightable gaussian codec avatars

    Junxuan Li, Chen Cao, Gabriel Schwartz, Rawal Khirod- kar, Christian Richardt, Tomas Simon, Yaser Sheikh, and Shunsuke Saito. Uravatar: Universal relightable gaussian codec avatars. InACM SIGGRAPH 2024 Conference Pa- pers, 2024. 6

  42. [42]

    Tianye Li, Timo Bolkart, Michael. J. Black, Hao Li, and Javier Romero. Learning a model of facial shape and ex- pression from 4D scans. 36(6):194:1–194:17, 2017. 2, 3, 4, 5

  43. [43]

    Animatable gaussians: Learning pose-dependent gaussian maps for high-fidelity human avatar modeling

    Zhe Li, Zerong Zheng, Lizhen Wang, and Yebin Liu. Animatable gaussians: Learning pose-dependent gaussian maps for high-fidelity human avatar modeling. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 3

  44. [44]

    Hhavatar: Gaussian head avatar with dynamic hairs.arXiv e-prints, pages arXiv–2312, 2023

    Zhanfeng Liao, Yuelang Xu, Zhe Li, Qijing Li, Boyao Zhou, Ruifeng Bai, Di Xu, Hongwen Zhang, and Yebin Liu. Hhavatar: Gaussian head avatar with dynamic hairs.arXiv e-prints, pages arXiv–2312, 2023. 2, 3

  45. [45]

    Hades: Human avatar with dynamic explicit hair strands

    Zhanfeng Liao, Hanzhang Tu, Cheng Peng, Hongwen Zhang, Boyao Zhou, and Yebin Liu. Hades: Human avatar with dynamic explicit hair strands. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 12318–12327, 2025. 2, 3

  46. [46]

    Robust high-resolution video mat- ting with temporal guidance, 2021

    Shanchuan Lin, Linjie Yang, Imran Saleemi, and Soumyadip Sengupta. Robust high-resolution video mat- ting with temporal guidance, 2021. 3

  47. [47]

    Stephen Lombardi, Tomas Simon, Gabriel Schwartz, Michael Zollhoefer, Yaser Sheikh, and Jason M. Saragih. Mixture of volumetric primitives for efficient neural ren- dering.ACM TOG, 40:1 – 13, 2021. 2

  48. [48]

    arXiv preprint arXiv:2402.10483 (2024)

    Haimin Luo, Min Ouyang, Zijun Zhao, Suyi Jiang, Long- wen Zhang, Qixuan Zhang, Wei Yang, Lan Xu, and Jingyi Yu. GaussianHair: Hair modeling and rendering with light- aware gaussians.arXiv preprint arXiv:2402.10483, 2024. 3, 4

  49. [49]

    Real- time hair simulation with neural interpolation.IEEE Trans- actions on Visualization and Computer Graphics, 28(4): 1894–1905, 2020

    Qing Lyu, Menglei Chai, Xiang Chen, and Kun Zhou. Real- time hair simulation with neural interpolation.IEEE Trans- actions on Visualization and Computer Graphics, 28(4): 1894–1905, 2020. 5

  50. [50]

    Jewett, Si- mon Venshtain, Christopher Heilman, Yueh-Tung Chen, Sidi Fu, Mohamed Ezzeldin A

    Julieta Martinez, Emily Kim, Javier Romero, Timur Bagautdinov, Shunsuke Saito, Shoou-I Yu, Stuart Ander- son, Michael Zollh¨ofer, Te-Li Wang, Shaojie Bai, Chenghui Li, Shih-En Wei, Rohan Joshi, Wyatt Borsos, Tomas Si- mon, Jason Saragih, Paul Theodosis, Alexander Greene, Anjani Josyula, Silvio Mano Maeta, Andrew I. Jewett, Si- mon Venshtain, Christopher H...

  51. [51]

    Srinivasan, Matthew Tancik, Jonathan T

    Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. InECCV, pages 405–421, 2020. 3

  52. [52]

    Strand-accurate multi-view hair capture

    Giljoo Nam, Chenglei Wu, Min H Kim, and Yaser Sheikh. Strand-accurate multi-view hair capture. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 155–164, 2019. 3

  53. [53]

    Compgs: Smaller and faster gaussian splatting with vector quantiza- tion.ECCV, 2024

    KL Navaneet, Kossar Pourahmadi Meibodi, Soroush Ab- basi Koohpayegani, and Hamed Pirsiavash. Compgs: Smaller and faster gaussian splatting with vector quantiza- tion.ECCV, 2024. 2

  54. [54]

    V olumetric portrait avatar

    Jalees Nehvi, Berna Kabadayi, Julien Valentin, and Justus Thies. V olumetric portrait avatar. InPattern Recognition: 46th DAGM German Conference, DAGM GCPR 2024, Mu- nich, Germany, September 10–13, 2024, Proceedings, Part II, page 3–19, 2025. 2

  55. [55]

    Black, and Justus Thies

    Mirela Ostrek, Michael J. Black, and Justus Thies. Hair- free: Compositional 2d head prior for text-driven 360° bald texture synthesis. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. 3

  56. [56]

    ASH: animatable gaus- sian splats for efficient and photoreal human rendering

    Haokai Pang, Heming Zhu, Adam Kortylewski, Christian Theobalt, and Marc Habermann. ASH: animatable gaus- sian splats for efficient and photoreal human rendering. In CVPR, pages 1165–1175. IEEE, 2024. 2

  57. [57]

    Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. Expressive body capture: 3D hands, face, and body from a single image. InCVPR, pages 10975– 10985, 2019. 3

  58. [58]

    Paysan, R

    P. Paysan, R. Knothe, B. Amberg, S. Romdhani, and T. Vet- ter. A 3D face model for pose and illumination invariant face recognition.IEEE International Conference on Ad- vanced Video and Signal based Surveillance (AVSS), 2009. 6

  59. [59]

    Pois- son image editing

    Patrick P ´erez, Michel Gangnet, and Andrew Blake. Pois- son image editing. InACM SIGGRAPH 2003 Papers, page 313–318, New York, NY , USA, 2003. Association for Com- puting Machinery. 5

  60. [60]

    Barron, and Ben Milden- hall

    Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Milden- hall. Dreamfusion: Text-to-3d using 2d diffusion.arXiv,

  61. [61]

    Vhap: Versatile head alignment with adap- tive appearance priors, 2024

    Shenhan Qian. Vhap: Versatile head alignment with adap- tive appearance priors, 2024. 2, 5

  62. [62]

    GaussianAvatars: Photorealistic head avatars with rigged 3D gaussians

    Shenhan Qian, Tobias Kirschstein, Liam Schoneveld, Da- vide Davoli, Simon Giebenhain, and Matthias Nießner. GaussianAvatars: Photorealistic head avatars with rigged 3D gaussians. InCVPR, pages 20299–20309, 2024. 1, 2, 3, 4, 5, 6, 7

  63. [63]

    An efficient repre- sentation for irradiance environment maps.SIGGRAPH,

    Ravi Ramamoorthi and Pat Hanrahan. An efficient repre- sentation for irradiance environment maps.SIGGRAPH,

  64. [64]

    Neural strands: Learning hair geometry and appearance from multi-view images

    Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chen- glei Wu, Sven Behnke, and Giljoo Nam. Neural strands: Learning hair geometry and appearance from multi-view images. InComputer Vision – ECCV 2018: 15th European Conference, 2022. 3

  65. [65]

    Radu Alexandru Rosu, Keyu Wu, Yao Feng, Youyi Zheng, and Michael J. Black. DiffLocks: Generating 3d hair from a single image using diffusion models. InIEEE/CVF Conf. on Computer Vision and Pattern Recognition(CVPR),

  66. [66]

    Relightable gaussian codec avatars

    Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, and Giljoo Nam. Relightable gaussian codec avatars. In CVPR, pages 130–141, 2024. 2, 3

  67. [67]

    Otaduy, and Dan Casas

    Igor Santesteban, Nils Thuerey, Miguel A. Otaduy, and Dan Casas. Self-supervised collision handling via generative 3d garment models for virtual try-on. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11763–11773, 2021. 3

  68. [68]

    Otaduy, and Dan Casas

    Igor Santesteban, Miguel A. Otaduy, and Dan Casas. Snug: Self-supervised neural dynamic garments. InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition (CVPR), pages 8140–8150, 2022. 3

  69. [69]

    SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

    Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, and Zeyu Wang. SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 3

  70. [70]

    Ct2hair: High-fidelity 3d hair modeling using com- puted tomography.ACM Transactions on Graphics, 42(4): 1–13, 2023

    Yuefan Shen, Shunsuke Saito, Ziyan Wang, Olivier Maury, Chenglei Wu, Jessica Hodgins, Youyi Zheng, and Giljoo Nam. Ct2hair: High-fidelity 3d hair modeling using com- puted tomography.ACM Transactions on Graphics, 42(4): 1–13, 2023. 3

  71. [71]

    Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction

    Vanessa Sklyarova, Jenya Chelishev, Andreea Dogaru, Igor Medvedev, Victor Lempitsky, and Egor Zakharov. Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023. 3, 4, 6, 8, 1

  72. [72]

    Black, and Justus Thies

    Vanessa Sklyarova, Egor Zakharov, Otmar Hilliges, Michael J. Black, and Justus Thies. Text-conditioned gen- erative model of 3d strand-based human hairstyles. In IEEE/CVF Conf. on Computer Vision and Pattern Recog- nition (CVPR), 2024. 2, 3

  73. [73]

    Black, and Justus Thies

    Vanessa Sklyarova, Berna Kabadayi, Anastasios Yian- nakidis, Giorgio Becherini, Michael J. Black, and Justus Thies. Neuralfur: Neuralfur: Animal fur reconstruction from multi-view images.ArXiv, 2026. 2

  74. [74]

    Quaffure: Real-time quasi-static neural hair simulation

    Tuur Stuyck, Gene Wei-Chin Lin, Egor Larionov, Hsiao-yu Chen, Aljaz Bozic, Nikolaos Sarafianos, and Doug Roble. Quaffure: Real-time quasi-static neural hair simulation. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, 2025. 3

  75. [75]

    Ca- phy: Capturing physical properties for animatable human avatars

    Zhaoqi Su, Liangxiao Hu, Siyou Lin, Hongwen Zhang, Shengping Zhang, Justus Thies, and Yebin Liu. Ca- phy: Capturing physical properties for animatable human avatars. InIEEE/CVF International Conference on Com- puter Vision (ICCV), 2023. 3

  76. [76]

    Dr.hair: Reconstructing scalp- connected hair strands without pre-training via differen- tiable rendering of line segments

    Yusuke Takimoto, Hikari Takehara, Hiroyuki Sato, Zi- hao Zhu, and Bo Zheng. Dr.hair: Reconstructing scalp- connected hair strands without pre-training via differen- tiable rendering of line segments. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20601–20611, 2024. 3

  77. [77]

    3D face tracking from 2D video through iterative dense UV to image flow

    Felix Taubner, Prashant Raina, Mathieu Tuli, Eu Wern Teh, Chul Lee, and Jinmiao Huang. 3D face tracking from 2D video through iterative dense UV to image flow. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1227–1237, 2024. 2

  78. [78]

    Felix Taubner, Ruihang Zhang, Mathieu Tuli, and David B. Lindell. Cap4d: Creating animatable 4d portrait avatars with morphable multi-view diffusion models, 2024. 3

  79. [79]

    Gaussianheads: End-to-end learning of drivable gaussian head avatars from coarse-to-fine representations.ACM Transactions on Graphics (ToG), 43(6):1–12, 2024

    Kartik Teotia, Hyeongwoo Kim, Pablo Garrido, Marc Habermann, Mohamed Elgharib, and Christian Theobalt. Gaussianheads: End-to-end learning of drivable gaussian head avatars from coarse-to-fine representations.ACM Transactions on Graphics (ToG), 43(6):1–12, 2024. 2

  80. [80]

    Fml: Face model learning from videos

    Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick P´erez, Michael Zollhofer, and Christian Theobalt. Fml: Face model learning from videos. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10812–10822, 2019. 2

Showing first 80 references.