pith. machine review for the scientific record.

arxiv: 2605.05636 · v1 · submitted 2026-05-07 · 💻 cs.CV · cs.GR

Recognition: unknown

Learning a Delighting Prior for Facial Appearance Capture in the Wild

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 14:59 UTC · model grok-4.3

classification 💻 cs.CV cs.GR
keywords delighting prior · facial reflectance · in-the-wild capture · inverse rendering · relightable scans · Dataset Latent Modulation · OLAT data · Light Stage data

The pith

A delighting network trained on studio data creates a prior that extracts high-quality facial reflectance from casual smartphone videos.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper moves away from complex model-based inverse rendering toward learning a delighting prior that separates reflectance from unknown lighting in facial videos. It trains this network on OLAT and Light Stage data by using a modulation approach to combine the two sources without style conflicts. The resulting prior supports a straightforward optimization pipeline that estimates reflectance and related properties directly from everyday video inputs. This yields better results than earlier methods and produces a new collection of detailed relightable facial scans. The work aims to make high-end appearance capture practical outside controlled studios.
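
Concretely, the capture stage this summary describes could be as small as the following sketch: freeze the trained delighting network, run it over the video frames, and optimize a shared albedo map against its outputs. This is a minimal illustration under assumed names (`delight_net`, `frames`) and an assumed L1 consistency loss; the actual pipeline also estimates geometry and other reflectance properties.

    # Minimal sketch of prior-constrained reflectance optimization.
    # `delight_net` stands in for the trained delighting prior; names,
    # shapes, and the L1 loss are assumptions, not the paper's code.
    import torch

    def optimize_reflectance(delight_net, frames, steps=500, lr=1e-2):
        """frames: (N, 3, H, W) video frames; returns a (1, 3, H, W) albedo map."""
        delight_net.eval()
        with torch.no_grad():
            # The frozen prior turns each lit frame into a pseudo
            # ground-truth reflectance estimate.
            targets = delight_net(frames)

        # Free variable: one shared albedo map, initialized from the mean.
        albedo = targets.mean(dim=0, keepdim=True).clone().requires_grad_(True)
        opt = torch.optim.Adam([albedo], lr=lr)

        for _ in range(steps):
            opt.zero_grad()
            # Consistency with the prior across frames; a real pipeline
            # would warp by per-frame geometry and add regularizers.
            loss = (albedo - targets).abs().mean()
            loss.backward()
            opt.step()
        return albedo.detach()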

Core claim

By training a delighting network on the OLAT dataset and rendered Light Stage scans with Dataset Latent Modulation, the authors learn a prior that disentangles reflectance from illumination. Conditioning the network on learnable source-aware tokens decouples dataset styles from physical delighting rules, so the prior can constrain an optimization that recovers accurate reflectance from casual in-the-wild videos and outperforms prior approaches by a large margin. The same pipeline converts the multi-view NeRSemble dataset into a large set of 4K-resolution relightable scans.

What carries the argument

Dataset Latent Modulation (DLM), which conditions the delighting network on learnable source-aware tokens to separate dataset-specific styles from universal delighting principles.
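
As an illustration of the mechanism, here is a hedged sketch of one way source-aware tokens could modulate the backbone, using a FiLM-style per-channel scale and shift. The per-dataset embedding, token size, and injection form are assumptions; the paper may condition its layers differently (e.g., through attention).

    # Hypothetical sketch of Dataset Latent Modulation conditioning.
    import torch
    import torch.nn as nn

    class DatasetLatentModulation(nn.Module):
        def __init__(self, num_sources: int, feat_dim: int, token_dim: int = 64):
            super().__init__()
            # One learnable source-aware token per training dataset,
            # e.g., 0 = OLAT captures, 1 = rendered Light Stage scans.
            self.tokens = nn.Embedding(num_sources, token_dim)
            # Map a token to a per-channel scale and shift.
            self.to_film = nn.Linear(token_dim, 2 * feat_dim)

        def forward(self, feats: torch.Tensor, source_id: torch.Tensor):
            # feats: (B, C, H, W); source_id: (B,) integer dataset labels.
            tok = self.tokens(source_id)                 # (B, token_dim)
            scale, shift = self.to_film(tok).chunk(2, dim=-1)
            # Style lives in the token pathway; the backbone is left free
            # to learn source-agnostic delighting rules.
            return feats * (1 + scale[:, :, None, None]) + shift[:, :, None, None]

At inference on in-the-wild footage, one would presumably fix a single canonical token so all outputs land in one consistent reflectance style.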

If this is right

  • A simple optimization process suffices for high-quality reflectance estimation from casual video inputs.
  • Reflectance estimates surpass those from previous model-based inverse rendering techniques by a large margin.
  • Existing multi-view face datasets can be converted into large-scale collections of 4K relightable scans.
  • Open-sourcing the trained model and the resulting scan dataset supports further development of photorealistic digital humans.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The prior might be adapted to related inverse problems such as full-scene lighting estimation or material recovery from consumer video.
  • Further fine-tuning the network on broader real-world video distributions could reduce any remaining domain-gap artifacts.
  • Consumer applications could emerge that let users record themselves on a phone and obtain immediately usable relightable face models.

Load-bearing premise

The delighting network trained on controlled studio lighting data will generalize without artifacts to the diverse and uncontrolled illumination present in arbitrary smartphone videos.

What would settle it

Side-by-side visual and quantitative comparison of the estimated reflectance maps against known ground-truth reflectance on a collection of in-the-wild videos captured under measured lighting would confirm or refute the claimed quality gains.
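
A minimal sketch of that comparison, assuming aligned float albedo maps in [0, 1]: the metric calls are scikit-image's, while pairing each in-the-wild estimate with a measured ground-truth map is the hypothetical part.

    # Hedged sketch: per-image quality metrics on albedo maps.
    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def albedo_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
        """pred, gt: (H, W, 3) float albedo maps in [0, 1]."""
        return {
            "psnr": peak_signal_noise_ratio(gt, pred, data_range=1.0),
            "ssim": structural_similarity(gt, pred, data_range=1.0, channel_axis=-1),
            "mae": float(np.abs(gt - pred).mean()),
        }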

Figures

Figures reproduced from arXiv: 2605.05636 by Feng Xu, Lan Xu, Qixuan Zhang, Tianxiao Li, Xin Ming, Yuxuan Han, Zhuofan Shen.

Figure 1. Given a smartphone sequence captured in the wild (4 sampled views shown here), our method reconstructs high-quality facial assets, which can …
Figure 3. Example training pairs generated from our mixing datasets.
Figure 4. Architecture and training pipeline of our base delighting network.
Figure 5. Comparison to prior arts in terms of diffuse albedo prediction on the FaceOLAT (first two rows) and 3DRFE (last two rows) datasets.
Figure 6. Comparison to SwitchLight on in-the-wild face images sampled from the FFHQ dataset in terms of diffuse albedo prediction.
Figure 7. Comparison to a portrait shadow removal method, GPSR.
Figure 10. Ablation study of the Dataset Latent Modulation (DLM) technique.
Figure 8. Ablation study on mixing dataset training in terms of diffuse albedo …
Figure 9. Ablation study on mixing dataset training and mixing training strat…
Figure 11. Comparison to existing facial appearance capture methods in terms of diffuse albedo reconstruction.
Figure 12. Ablation study of the detail enhancement network in diffuse albedo …
Figure 13. Comparison to WildCap on diffuse albedo reconstruction. We show …
Figure 15. Evaluation on reconstruction fidelity of our facial appearance cap…
Figure 16. Facial appearance capture results of our method on subjects from diverse ethnic groups captured under complex in-the-wild environments.
Figure 16. We believe that collecting a more balanced dataset across …
read the original abstract

High-quality facial appearance capture has traditionally required costly studio recording. Recent works consider an in-the-wild smartphone-based setup; however, their model-based inverse rendering paradigm struggles with the complex disentanglement of reflectance from unknown illumination. To bridge this gap, we propose to shift the paradigm into training a powerful delighting network as a prior to constrain the optimization. We leverage the OLAT dataset and the rendered Light Stage scans for training, and propose Dataset Latent Modulation (DLM) to seamlessly integrate these heterogeneous data sources. Specifically, by conditioning the core network on learnable source-aware tokens, we decouple dataset-specific styles from physical delighting principles, enabling the emergence of a delighting prior that outperforms existing proprietary models. This powerful delighting prior enables a simple and automatic appearance capture pipeline that achieves high-quality reflectance estimation from casual video inputs, outperforming prior arts by a large margin. Furthermore, we leverage our appearance capture method to transform the multi-view NeRSemble dataset into NeRSemble-Scan, a large-scale collection of 4K-resolution relightable scans. By open-sourcing our model and the NeRSemble-Scan dataset, we democratize high-end facial capture and provide a new foundation for the research community to build photorealistic digital humans.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes shifting from model-based inverse rendering to a learned delighting prior for high-quality facial reflectance capture from casual smartphone videos. It trains a delighting network on OLAT captures and rendered Light Stage scans using Dataset Latent Modulation (DLM) with learnable source-aware tokens to integrate heterogeneous data sources while decoupling styles from physical principles. The resulting prior constrains an optimization pipeline for automatic appearance capture, with claims of large-margin outperformance over prior arts; the authors also derive and open-source the NeRSemble-Scan dataset of 4K relightable scans from NeRSemble.

Significance. If the generalization claims hold, the work has substantial significance by democratizing studio-quality facial appearance capture for in-the-wild inputs and providing an open-source model plus large-scale relightable dataset as a foundation for photorealistic digital human research. The data-driven prior approach and open-sourcing are notable strengths.

major comments (3)
  1. [Abstract, §4 Experiments] The central claim of 'large-margin outperformance' and high-quality reflectance estimation from casual videos is stated without any quantitative metrics (e.g., albedo error, PSNR/SSIM on reflectance maps), ablation results on the DLM tokens, or error analysis. These details are load-bearing for validating the delighting prior's effectiveness and must be added with specific numbers and controls.
  2. [§3, DLM and delighting network] The assumption that source-aware tokens fully decouple dataset-specific styles from physical delighting is critical. Since tokens are learned from the same OLAT/Light Stage sources used for training, the paper must demonstrate (via ablations or feature visualizations) that they prevent residual shading from being baked into the recovered reflectance, rather than merely fitting training-domain statistics.
  3. [§4, wild-video results] Generalization from controlled studio training data (single-source OLAT, known camera response, high SNR) to arbitrary in-the-wild smartphone videos (unknown environment maps, motion blur, sensor shifts, non-Lambertian effects) is the weakest and most load-bearing assumption. The manuscript requires explicit cross-domain quantitative tests, failure-case analysis, and comparisons showing no domain-shift artifacts in the final reflectance maps.
minor comments (2)
  1. [§3] Clarify the precise mathematical formulation of DLM conditioning (how tokens are injected into the network layers) and any hyper-parameters controlling token learning; one plausible shape is sketched after this list.
  2. [Figures] Ensure all figures showing reflectance results include side-by-side comparisons with ground-truth or prior methods, with consistent lighting and scale.
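
On minor comment 1, one plausible shape for the requested formulation, written as an assumed FiLM-style injection rather than the paper's actual notation:

    % Assumed FiLM-style form of DLM conditioning, not the paper's notation:
    % the source-aware token t_d modulates the features of layer \ell.
    \[
      \mathbf{F}'_{\ell} = \bigl(1 + \gamma_{\ell}(\mathbf{t}_d)\bigr) \odot \mathbf{F}_{\ell} + \beta_{\ell}(\mathbf{t}_d),
      \qquad \mathbf{t}_d \in \mathbb{R}^{k},\; d \in \{\mathrm{OLAT}, \mathrm{LightStage}\},
    \]

where \(\mathbf{F}_{\ell}\) is the layer-\(\ell\) feature map, \(\gamma_{\ell}\) and \(\beta_{\ell}\) are learned projections of the token \(\mathbf{t}_d\), and \(\odot\) is channel-wise multiplication; the token dimension \(k\) and its learning rate would be the hyper-parameters to report.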

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. The comments highlight important areas for strengthening the quantitative validation and generalization analysis, which we will address in the revised manuscript.

read point-by-point responses
  1. Referee: [Abstract, §4 Experiments] The central claim of 'large-margin outperformance' and high-quality reflectance estimation from casual videos is stated without any quantitative metrics (e.g., albedo error, PSNR/SSIM on reflectance maps), ablation results on the DLM tokens, or error analysis. These details are load-bearing for validating the delighting prior's effectiveness and must be added with specific numbers and controls.

    Authors: We agree that explicit quantitative metrics would strengthen the claims. While the current manuscript prioritizes qualitative demonstrations for in-the-wild results, we have internal quantitative evaluations on held-out studio data. In the revision, we will add a table in §4 reporting PSNR, SSIM, and albedo error on test sets from OLAT and Light Stage data, including ablations on the DLM tokens and a brief error analysis. This will provide the specific numbers and controls requested. revision: yes

  2. Referee: [§3, DLM and delighting network] The assumption that source-aware tokens fully decouple dataset-specific styles from physical delighting is critical. Since tokens are learned from the same OLAT/Light Stage sources used for training, the paper must demonstrate (via ablations or feature visualizations) that they prevent residual shading from being baked into the recovered reflectance, rather than merely fitting training-domain statistics.

    Authors: We appreciate this point on the critical assumption. To demonstrate decoupling, the revised §3 will include an ablation removing the source-aware tokens (showing increased style leakage and degraded delighting) and feature visualizations such as t-SNE embeddings of network activations, confirming that tokens isolate source styles while the core network enforces physical principles without baking in residual shading; a sketch of such a visualization follows these responses. revision: yes

  3. Referee: [§4, wild-video results] Generalization from controlled studio training data (single-source OLAT, known camera response, high SNR) to arbitrary in-the-wild smartphone videos (unknown environment maps, motion blur, sensor shifts, non-Lambertian effects) is the weakest and most load-bearing assumption. The manuscript requires explicit cross-domain quantitative tests, failure-case analysis, and comparisons showing no domain-shift artifacts in the final reflectance maps.

    Authors: Generalization is indeed central. Direct quantitative cross-domain tests on reflectance error are not feasible without ground-truth data for wild videos. We will add a failure-case analysis subsection in §4 with examples of challenging inputs (motion blur, extreme lighting) and comparisons to baselines, plus additional qualitative results from diverse smartphone videos to illustrate robustness and absence of obvious domain-shift artifacts. revision: partial
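
The visualization promised in the second response could take the following shape: embed backbone activations with t-SNE and color by data source. This is a hedged sketch, not the authors' tooling; `feats` and `labels` are assumed inputs (N×D activations and N integer dataset ids), and the interpretation in the comments is the decoupling hypothesis, not a reported result.

    # Hypothetical sketch of the promised feature visualization.
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.manifold import TSNE

    def plot_tsne_by_source(feats: np.ndarray, labels: np.ndarray) -> None:
        """feats: (N, D) backbone activations; labels: (N,) dataset ids."""
        emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(feats)
        for d in np.unique(labels):
            mask = labels == d
            plt.scatter(emb[mask, 0], emb[mask, 1], s=4, label=f"source {d}")
        plt.legend()
        plt.title("Backbone activations colored by data source")
        plt.show()
        # If the tokens truly absorb dataset style, the two sources should
        # overlap in this plot (style-free backbone) while the learned token
        # embeddings themselves stay well separated.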

standing simulated objections not resolved
  • Direct quantitative cross-domain tests (e.g., albedo error) on in-the-wild smartphone videos, due to the inherent lack of ground-truth reflectance maps for such casual captures.

Circularity Check

0 steps flagged

No circularity: derivation relies on external training data and independent optimization

full rationale

The paper trains a delighting network on separate studio datasets (OLAT and rendered Light Stage scans) using the proposed Dataset Latent Modulation (DLM) to condition on source-aware tokens, then applies the resulting network as a prior within an optimization pipeline for in-the-wild reflectance recovery. No equations, definitions, or steps in the provided text reduce the final reflectance estimates or the 'powerful delighting prior' claim to fitted parameters or self-referential constructions inside the paper. The central pipeline is data-driven training followed by application to new video inputs, and no self-citation bears the load of the uniqueness claim or of the delighting principles themselves. This is a standard empirical ML pipeline with independent content.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the unproven generalization of a learned delighting function from controlled studio captures to arbitrary smartphone illumination; no free parameters are explicitly named in the abstract, but the learnable source-aware tokens function as dataset-specific fitted components.

free parameters (1)
  • source-aware tokens in DLM
    Learnable embeddings that modulate the network per dataset; their values are optimized during training and directly affect the extracted delighting prior.
axioms (1)
  • domain assumption: Studio OLAT and Light Stage data contain sufficient physical lighting variation to learn a prior that applies to in-the-wild smartphone videos.
    Invoked when claiming the trained network serves as a general delighting prior for casual video inputs.

pith-pipeline@v0.9.0 · 5540 in / 1374 out tokens · 28017 ms · 2026-05-08T14:59:38.793218+00:00 · methodology

