From Phase to Phenomenon: Self-Supervised Learning of Subsurface Scattering with Minimal Phase-shift Inputs

Andreas Engelhardt; Arjun Majumdar; Hendrik PA. Lensch; Raphael Braun

arxiv: 2606.29461 · v1 · pith:NYOFQCC5new · submitted 2026-06-28 · 💻 cs.CV

From Phase to Phenomenon: Self-Supervised Learning of Subsurface Scattering with Minimal Phase-shift Inputs

Arjun Majumdar , Raphael Braun , Andreas Engelhardt , Hendrik PA. Lensch This is my paper

Pith reviewed 2026-06-30 07:29 UTC · model grok-4.3

classification 💻 cs.CV

keywords self-supervised learningsubsurface scatteringphase-shift profilometrylight transportrelightingaugmentationscomputer visionscattering footprints

0 comments

The pith

Self-supervised pretraining on eight phase-shift images produces generalizable subsurface scattering representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a self-supervised pretraining method for learning subsurface scattering representations using a minimal set of eight high-frequency phase-shift profilometry images per view captured in a stereo projector-camera setup. By training an encoder across multiple views and objects with a custom augmentation strategy tailored to PSP data, the method extracts light transport features that outperform standard augmentations. These representations transfer to downstream tasks such as spatially varying relighting and can be evaluated with a kNN classifier. When combined with a decoder and a specialized cost function, the model reconstructs dense scattering footprints with high fidelity on previously unseen objects featuring complex geometry and materials.

Core claim

The pretrained encoder learns generalizable SSS representations that transfer effectively to downstream tasks, including spatially varying relighting and representation evaluation using a kNN classifier; combined with a decoder the model reconstructs dense scattering footprint responses achieving high-fidelity reconstructions on unseen objects with complex geometry and material properties using only eight input images per view.

What carries the argument

The self-supervised pretraining framework with tailored augmentations on high-frequency phase-shift profilometry images in a multi-view, multi-object setting to learn light-transport features.

Load-bearing premise

The tailored augmentation strategy for PSP-based SSS data combined with multi-view multi-object self-supervised pretraining is sufficient to extract generalizable light-transport features without requiring additional supervision or larger input sets.

What would settle it

A demonstration that the pretrained representations fail to transfer to relighting or produce low-fidelity reconstructions on objects with material properties not represented in the pretraining data would falsify the generalization claim.

Figures

Figures reproduced from arXiv: 2606.29461 by Andreas Engelhardt, Arjun Majumdar, Hendrik PA. Lensch, Raphael Braun.

**Figure 1.** Figure 1: Left: Our method works on an uncalibrated stereo projector-camera setup. Center: We capture images of the object under eight high-frequency sine wave patterns (PSP Images). Registration of camera- to projector-pixels is achieved via phase unwrapping. Our method estimates the SSS point spread function for any given surface point from a local crop of the PSP images. Relighting is achieved by scaling those S… view at source ↗

**Figure 2.** Figure 2: Effect of SSS on PSP images. Strongly scattering materials (apple) blur projected patterns and reduce contrast due to wider scattering footprints, whereas weakly scattering objects (pear, star) preserve sharper sinusoidal structures. Encoder pretraining Decoder training Encoder pred head Shared Encoder LSSL p2 z1 Weak Aug. Strong Aug. stop grad × Encoder Decoder skip connections KNN class pred gt LSBH z [… view at source ↗

**Figure 3.** Figure 3: Overview of proposed framework. Left: SSL pretraining of shared encoder using 2 augmented PSP patches, minimizing LSSL. Right: The decoder is trained to predict SSS footprints in a supervised setting, exploiting the embedding space of the frozen encoder. It is optimized based on ground truth SSS footprints and an auxiliary classification task. Classification is done with a zero-shot kNN approach directly o… view at source ↗

**Figure 4.** Figure 4: For pre-training the Encoder with SSL we only use 8 PSP images per object. The point spread functions (PSF) of every surface point are captured in parallel with a spacing of 55 pixels, which requires 3025 images. Those images are only captured and used to train the Decoder once. The decoder generalizes to unseen objects during inference. Thus for relighting new objects we only have to capture 8 PSP images.… view at source ↗

**Figure 5.** Figure 5: Demonstration of our PSP specific strong augmentations for one input patch. which can be mathematically written as: \mathcal {\mathcal {L}_{\text {SSL}}}(p_1, z_2) = - \frac {p_1}{\|p_1\|_2} \cdot \frac {z_2}{\|z_2\|_2}. (4) 3.4 Augmentation One of the most important factors in SSL pre-training is the employed data augmentation. There are standard data augmentation pipelines that are tailored for natural i… view at source ↗

**Figure 6.** Figure 6: Qualitative comparison of SSS footprints. Note the denoised quality of the predicted strongly anisotropic footprint. the point from the PSP images, obtain their latent representation with the encoder and finally decode the latent to obtain the pixel’s SSS response footprint. Relighting then boils down to scaling those responses with the light received from the virtual projector image and splatting the sca… view at source ↗

**Figure 7.** Figure 7: Comparison: predicted vs. ground-truth SSS footprint responses, showing one scan line for a representative patch. Note the faithful prediction from the PSP input even in the tail of the scattering profile [PITH_FULL_IMAGE:figures/full_fig_p012_7.png] view at source ↗

**Figure 8.** Figure 8: Qualitative comparisons on two test objects. Cam GT is a real photo of the object lit by the projector, Relit is the result of our relighting. Note: Those objects have never been seen during training of the encoder or decoder. Apple Crab Cam GT Relit [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗

**Figure 9.** Figure 9: Multiple relit views of two objects using the per-pixel reconstructed SSS footprints compared to camera-captured ground truth, where the real projector illuminates the scene with the same pattern. Note those objects have never been seen during training of the encoder or decoder. 4.4 kNN Representation Evaluation To evaluate the quality of the learned patch embeddings, we perform a k-nearest neighbor (kNN… view at source ↗

**Figure 10.** Figure 10: LEGO & Leaf relit under stripe illumination on unseen objects and illumination. spatially adjacent patches within the same image, we adopt a view-level leaveone-out protocol: all patches from one view are held out for evaluation, while patches from the remaining views form the feature database. For a query patch, its embedding is compared to the database using cosine similarity, and the label is predict… view at source ↗

**Figure 11.** Figure 11: PCA-RGB visualization of the learned SSS features for different objects. On the soap with heterogeneous materials, the visualization clearly marks the veins. On the hand shovel, the normal directions can be seen. 4.6 Limitations and Future Work Our method reconstructs accurate SSS footprints for the specific camera–projector configuration used during acquisition. Although it generalizes across views and g… view at source ↗

read the original abstract

We propose a self-supervised pretraining framework for learning sub-surface scattering (SSS) light transport representations from minimal input. Our method leverages a stereo projector-camera setup that captures only eight high-frequency phase-shift profilometry (PSP) images per view to pretrain an encoder in a multi-view, multi-object setting. We introduce a tailored augmentation strategy for PSP-based SSS data, and show that it significantly outperforms standard ImageNet-style augmentations for SSL pretraining. The pretrained encoder learns generalizable SSS representations that transfer effectively to downstream tasks, including spatially varying relighting and representation evaluation using a kNN classifier. Combined with a decoder, the model reconstructs dense scattering footprint responses, trained using a dedicated cost function that improves accuracy, particularly for anisotropic footprints. Despite using only eight input images per view, our approach generalizes to unseen objects with complex geometry and material properties, achieving high-fidelity reconstructions while requiring orders of magnitude fewer images than prior methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper describes an SSL method for subsurface scattering from 8 PSP images but lacks any quantitative evidence to support its claims.

read the letter

The punchline for you is that the paper describes a self-supervised pretraining method for subsurface scattering representations using minimal phase-shift inputs, but the provided abstract has no numbers or details to back up the performance claims.

What they do is set up a stereo projector-camera system to capture eight high-frequency PSP images per view, then pretrain an encoder across multiple views and objects. They add a tailored augmentation strategy that they claim works better than regular ImageNet-style ones for this data, and a custom cost function for the decoder to handle anisotropic footprints better. The pretrained model is then used for tasks like spatially varying relighting and kNN-based evaluation, and with the decoder it reconstructs dense scattering responses on unseen objects.

This is new in combining those elements for SSS specifically with such low input counts. It does a decent job highlighting the potential for practical capture setups that don't require hundreds of images like older methods.

The soft spots are significant though. The abstract talks about high-fidelity reconstructions and effective generalization but supplies zero quantitative results, error bars, or ablation studies. Without those, you can't verify if the tailored SSL is the reason for any gains or if it's something else. The central assumption that the augmentations and pretraining extract generalizable light-transport features without supervision is left untested in the text we have. That matches the stress-test note about needing to isolate whether it's really the physics or just correlations.

This kind of work would interest people doing material acquisition in computer graphics, especially those looking at efficient capture techniques. A reader might get value from the overall framework if the full paper has solid experiments, but based on what's here it's more of an idea than a validated result.

I wouldn't bring this to the next reading group because there's not enough to discuss substantively. I wouldn't cite it in my work. It doesn't seem ready for peer review either, since the evidence to support the claims isn't present. I'd recommend asking the authors for the full results or waiting for a revised version with the numbers.

Referee Report

3 major / 1 minor

Summary. The paper proposes a self-supervised pretraining framework for learning subsurface scattering (SSS) light transport representations from a stereo projector-camera setup that captures only eight high-frequency phase-shift profilometry (PSP) images per view. In a multi-view, multi-object setting, it introduces a tailored augmentation strategy for PSP-based SSS data that is claimed to significantly outperform standard ImageNet-style augmentations. The pretrained encoder is said to learn generalizable SSS representations that transfer to downstream tasks such as spatially varying relighting and kNN-based representation evaluation. Combined with a decoder and a dedicated cost function, the model reconstructs dense scattering footprint responses, generalizing to unseen objects with complex geometry and material properties while using orders of magnitude fewer images than prior methods.

Significance. If the central claims are substantiated with quantitative evidence, the work would be significant for computer vision and graphics by demonstrating that self-supervised learning with minimal PSP inputs and tailored augmentations can extract transferable light-transport features for SSS, substantially lowering data acquisition costs compared to supervised or dense-sampling approaches. The multi-view multi-object pretraining and kNN evaluation protocol could provide a template for physics-informed SSL in other light-transport domains.

major comments (3)

[Abstract] Abstract: The claim that the tailored augmentation strategy 'significantly outperforms standard ImageNet-style augmentations for SSL pretraining' is presented without any quantitative results, error bars, ablation tables, or statistical tests. This directly undermines assessment of whether the augmentation (vs. the multi-view multi-object setup or data volume alone) drives the reported generalization.
[Abstract] Abstract and downstream evaluation sections: The central claim that the encoder learns representations capturing 'subsurface scattering light transport' (rather than low-level image statistics or dataset correlations) is load-bearing, yet the manuscript offers downstream success on relighting and kNN classification as evidence without controls that isolate the tailored PSP augmentations from standard augmentations or additional data. This leaves the physics-based interpretation under-supported.
[Abstract] Abstract: The statement that the dedicated cost function 'improves accuracy, particularly for anisotropic footprints' and that the approach achieves 'high-fidelity reconstructions' on unseen objects lacks any reported metrics, comparisons to baselines, or details on the reconstruction error, making verification of the minimal-input advantage impossible from the given text.

minor comments (1)

[Abstract] The abstract refers to 'representation evaluation using a kNN classifier' without specifying the feature space, distance metric, or how the classifier is trained/evaluated on the pretrained embeddings.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful comments, which highlight opportunities to better substantiate the abstract claims with explicit references to the quantitative evidence in the main text. We address each point below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: The claim that the tailored augmentation strategy 'significantly outperforms standard ImageNet-style augmentations for SSL pretraining' is presented without any quantitative results, error bars, ablation tables, or statistical tests. This directly undermines assessment of whether the augmentation (vs. the multi-view multi-object setup or data volume alone) drives the reported generalization.

Authors: We agree the abstract would benefit from direct pointers to the supporting evidence. The main manuscript contains dedicated ablation studies (Section 4.2, Table 2) that compare the tailored PSP augmentations against standard ImageNet-style augmentations on the same multi-view multi-object pretraining data, reporting downstream task metrics with standard deviations across three random seeds. We will revise the abstract to reference these results and ensure error bars are visible in the corresponding figures. revision: yes
Referee: [Abstract] Abstract and downstream evaluation sections: The central claim that the encoder learns representations capturing 'subsurface scattering light transport' (rather than low-level image statistics or dataset correlations) is load-bearing, yet the manuscript offers downstream success on relighting and kNN classification as evidence without controls that isolate the tailored PSP augmentations from standard augmentations or additional data. This leaves the physics-based interpretation under-supported.

Authors: The manuscript does provide isolating controls: Section 4.3 reports kNN classification and relighting performance for encoders pretrained with tailored PSP augmentations versus standard augmentations (and versus no pretraining), using identical data volume and architecture. These ablations are designed to separate the contribution of the augmentation strategy. We will expand the discussion in the downstream sections to more explicitly connect these controls to the light-transport interpretation and add a short paragraph addressing potential low-level statistic confounds. revision: partial
Referee: [Abstract] Abstract: The statement that the dedicated cost function 'improves accuracy, particularly for anisotropic footprints' and that the approach achieves 'high-fidelity reconstructions' on unseen objects lacks any reported metrics, comparisons to baselines, or details on the reconstruction error, making verification of the minimal-input advantage impossible from the given text.

Authors: Quantitative reconstruction results, including L1 and angular error metrics, baseline comparisons, and an ablation of the dedicated cost function, appear in Section 5 and Table 3, with particular gains shown for anisotropic cases in Figure 6. The abstract summarizes these findings. We will revise the abstract to include key metric values and explicit references to the tables and figures that substantiate the minimal-input advantage. revision: yes

Circularity Check

0 steps flagged

No circularity; claims rest on empirical transfer to downstream tasks rather than definitional reduction

full rationale

The paper describes a self-supervised pretraining pipeline that uses eight PSP images per view, tailored augmentations, and a multi-view multi-object setup to train an encoder, followed by transfer to relighting, kNN evaluation, and decoder-based reconstruction with a dedicated cost function. No load-bearing step equates a prediction to its own fitted input by construction, invokes a self-citation as an unverified uniqueness theorem, or renames an input as an output. The derivation chain is self-contained against external benchmarks (unseen objects, complex geometry/materials) and does not reduce to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, invented entities, or detailed axioms; the central claims rest on the unverified effectiveness of the custom augmentation and the sufficiency of eight PSP images.

axioms (1)

domain assumption Eight high-frequency phase-shift images per view from a stereo projector-camera setup contain sufficient information to pretrain generalizable SSS representations.
Core premise of the input reduction strategy stated in the abstract.

pith-pipeline@v0.9.1-grok · 5704 in / 1281 out tokens · 49146 ms · 2026-06-30T07:29:53.377469+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

49 extracted references · 18 canonical work pages · 2 internal anchors

[1]

In: ICLR

Bardes, A., Ponce, J., LeCun, Y.: Vicreg: Variance-invariance-covariance regular- ization for self-supervised learning. In: ICLR. OpenReview.net (2022)

2022
[2]

In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T

Caron, M., Misra, I., Mairal,J., Goyal, P., Bojanowski, P., Joulin, A.: Unsupervised learning of visual features by contrasting cluster assignments. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T. (eds.) NeurIPS (2020)

2020
[3]

In: ICCV (2021)

Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., Joulin, A.: Emerging properties in self-supervised vision transformers. In: ICCV (2021)

2021
[4]

A Simple Framework for Contrastive Learning of Visual Representations

Chen, T., Kornblith, S., Norouzi, M., Hinton, G.E.: A simple framework for con- trastive learning of visual representations. CoRRabs/2002.05709(2020)

work page internal anchor Pith review Pith/arXiv arXiv 2002
[5]

CoRRabs/2006.10029(2020)

Chen, T., Kornblith, S., Swersky, K., Norouzi, M., Hinton, G.E.: Big self-supervised models are strong semi-supervised learners. CoRRabs/2006.10029(2020)

work page arXiv 2006
[7]

In: 2008 IEEE Conference on Computer Vision and Pattern Recognition

Chen, T., Seidel, H.P., Lensch, H.P.A.: Modulated phase-shifting for 3d scanning. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–8 (2008).https://doi.org/10.1109/CVPR.2008.4587836

work page doi:10.1109/cvpr.2008.4587836 2008
[8]

In: CVPR

Chen, X., He, K.: Exploring simple siamese representation learning. In: CVPR. pp. 15750–15758. Computer Vision Foundation / IEEE (2021)

2021
[9]

ACM TOG37(2018)

Christensen, P., Fong, J., Shade, J., Wooten, W., Schubert, B., Kensler, A., Fried- man, S., Kilpatrick, C., Ramshaw, C., Bannister, M., Rayner, B., Brouillat, J., Liani, M.: Renderman: An advanced path-tracing architecture for movie render- ing. ACM TOG37(2018)

2018
[10]

In: Proceedings of the 27th annual conference on Computer graphics and interactive techniques

Debevec, P., Hawkins, T., Tchou, C., Duiker, H.P., Sarokin, W., Sagar, M.: Ac- quiring the reflectance field of a human face. In: Proceedings of the 27th annual conference on Computer graphics and interactive techniques. pp. 145–156 (2000)

2000
[11]

In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2025)

Dihlmann, J.N., Majumdar, A., Engelhardt, A., Braun, R., Lensch, H.: Subsur- face scattering for gaussian splatting. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2025)

2025
[12]

In: Meila, M., Zhang, T

Ermolov, A., Siarohin, A., Sangineto, E., Sebe, N.: Whitening for self-supervised representation learning. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 3015–3024. PMLR (18–24 Jul 2021)

2021
[13]

ACM TOG37(2018)

Fascione, L., Hanika, J., Leone, M., Droske, M., Schwarzhaupt, J., Davidovič, T., Weidlich, A., Meng, J.: Manuka: A batch-shading architecture for spectral path tracing in movie production. ACM TOG37(2018)

2018
[14]

In: Computer Graphics Forum

Fuchs, C., Heinz, M., Levoy, M., Seidel, H.P., Lensch, H.P.: Combining confocal imaging and descattering. In: Computer Graphics Forum. vol. 27, pp. 1245–1253. Wiley Online Library (2008)

2008
[15]

In: Proceedings of the Eurographics Symposium on Rendering

Garg, G., Talvala, E.V., Levoy, M., Lensch, H.P.A.: Symmetric photography: Ex- ploiting data-sparseness in reflectance fields. In: Proceedings of the Eurographics Symposium on Rendering. pp. 251–262. Eurographics Association (2006)

2006
[16]

Journal of the Optical Society of America A39(2022)

Geiger, S., Hank, P., Kienle, A.: Improved topography reconstruction of volume scattering objects using structured light. Journal of the Optical Society of America A39(2022)

2022
[17]

In: ACM SIGGRAPH 2004 Papers, pp

Goesele, M., Lensch, H.P., Lang, J., Fuchs, C., Seidel, H.P.: Disco: acquisition of translucent objects. In: ACM SIGGRAPH 2004 Papers, pp. 835–844. ACM (2004) From Phase to Phenomenon 17

2004
[18]

In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T

Grill, J.B., Strub, F., Altché, F., Tallec, C., Richemond, P.H., Buchatskaya, E., Doersch, C., Pires, B.Á., Guo, Z., Azar, M.G., Piot, B., Kavukcuoglu, K., Munos, R., Valko, M.: Bootstrap your own latent - a new approach to self-supervised learning. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T. (eds.) NeurIPS (2020)

2020
[19]

In: 2012 IEEE Conference on Com- puter Vision and Pattern Recognition

Gupta, M., Nayar, S.K.: Micro phase shifting. In: 2012 IEEE Conference on Com- puter Vision and Pattern Recognition. pp. 813–820 (2012).https://doi.org/10. 1109/CVPR.2012.6247753

work page arXiv 2012
[20]

In: CVPR

He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.B.: Momentum contrast for unsuper- vised visual representation learning. In: CVPR. pp. 9726–9735. Computer Vision Foundation / IEEE (2020)

2020
[21]

In: Leibe, B., Matas, J., Sebe, N., Welling, M

He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV
[22]

pp. 630–645. Springer International Publishing, Cham (2016)

2016
[23]

Optics and Lasers in Engineering142 (2021).https://doi.org/10.1016/j.optlaseng.2021.106613

He, X., Kemao, Q.: A comparative study on temporal phase unwrapping methods in high-speed fringe projection profilometry. Optics and Lasers in Engineering142 (2021).https://doi.org/10.1016/j.optlaseng.2021.106613

work page doi:10.1016/j.optlaseng.2021.106613 2021
[24]

In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques

Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. pp. 511–518. SIGGRAPH ’01, Association for Computing Machinery, New York, NY, USA (2001).https: //doi.org/10.1145/383259.383319

work page doi:10.1145/383259.383319 2001
[25]

In: Proceedings of the 28th annual conference on Com- puter graphics and interactive techniques

Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: Proceedings of the 28th annual conference on Com- puter graphics and interactive techniques. pp. 511–518 (2001)

2001
[26]

Applied optics35(13), 2304–2314 (1996)

Kienle, A., Lilge, L., Patterson, M.S., Hibst, R., Steiner, R., Wilson, B.C.: Spatially resolved absolute diffuse reflectance measurements for noninvasive determination of the optical scattering and absorption coefficients of biological tissue. Applied optics35(13), 2304–2314 (1996)

1996
[27]

In: Biomedical Imaging and Sensing Conference

Kikuchi, K., Katsuyama, M., Shibata, T., Hardeberg, J., Yuasa, T., Aizu, Y.: Development of measurement system for subsurface scattering light of skin and analysis of its age-related changes. In: Biomedical Imaging and Sensing Conference. vol. 12608, pp. 123–126. SPIE (2023)

2023
[28]

Kirillov, E

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., Lo, W.Y., Dollár, P., Girshick, R.: Segment anything. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (2023). https://doi.org/10.1109/iccv51070.2023.00371

work page doi:10.1109/iccv51070.2023.00371 2023
[29]

In: ICCV

Koohpayegani, S.A., Tejankar, A., Pirsiavash, H.: Mean shift for self-supervised learning. In: ICCV. pp. 10306–10315. IEEE (2021)

2021
[30]

In: European Conference on Computer Vision

Lyu, L., Tewari, A., Leimkühler, T., Habermann, M., Theobalt, C.: Neural radi- ance transfer fields for relightable novel-view synthesis with global illumination. In: European Conference on Computer Vision. pp. 153–169. Springer (2022)

2022
[31]

In: Egger, B., Günther, T

Majumdar, A., Braun, R., Lensch, H.: Neural Acquisition & Representation of Subsurface Scattering. In: Egger, B., Günther, T. (eds.) Vision, Modeling, and Visualization. The Eurographics Association (2025).https://doi.org/10.2312/ vmv.20251228

2025
[32]

In: Proceedings of the 2004 IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Malvar, H.S., wei He, L., Cutler, R.: High-quality linear interpolation for demo- saicing of Bayer-patterned color images. In: Proceedings of the 2004 IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing (ICASSP). vol. 4, pp. iv–485–8 (May 2004).https://doi.org/10.1109/ICASSP.2004.1326587 18 Majumdar et al

work page doi:10.1109/icassp.2004.1326587 2004
[33]

In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019

Moreno, D., Son, K., Taubin, G.: Embedded phase shifting: Robust phase shifting with embedded signals. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2301–2309 (2015).https://doi.org/10.1109/CVPR. 2015.7298843

work page doi:10.1109/cvpr 2015
[34]

ACM Trans

Nayar, S., Krishnan, G., Grossberg, M., Raskar, R.: Fast separation of direct and global components of a scene using high frequency illumination. ACM Trans. Graph.25, 935–944 (Jul 2006).https://doi.org/10.1145/1179352.1141977

work page doi:10.1145/1179352.1141977 2006
[35]

ACM Trans

O’Toole, M., Kutulakos, K.N.: Optical computing for fast light transport analysis. ACM Trans. Graph.29(6), 164 (2010)

2010
[36]

ACM Trans

O’Toole, M., Raskar, R., Kutulakos, K.N.: Primal-dual coding to probe light trans- port. ACM Trans. Graph.31(4), 39–1 (2012)

2012
[37]

In: Proceedings of the 14th Eurographics workshop on Rendering

Peers, P., Dutré, P.: Wavelet environment matting. In: Proceedings of the 14th Eurographics workshop on Rendering. pp. 157–166 (2003)

2003
[38]

ACM Transactions on Graphics (ToG)28(1), 1–18 (2009)

Peers, P., Mahajan, D.K., Lamond, B., Ghosh, A., Matusik, W., Ramamoorthi, R., Debevec, P.: Compressive light transport sensing. ACM Transactions on Graphics (ToG)28(1), 1–18 (2009)

2009
[39]

U-Net: Convolutional Networks for Biomedical Image Segmentation

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed- ical image segmentation. CoRRabs/1505.04597(2015)

work page internal anchor Pith review Pith/arXiv arXiv 2015
[40]

ACM Transactions on Graphics24(3), 745–755 (2005).https://doi.org/10.1145/1073204.1073257

Sen,P.,Chen,B.,Garg,G.,Marschner,S.,Sen,M.,Zoras,P.,Lunsford,P.,Genetti, J., Levoy, M.: Dual photography. ACM Transactions on Graphics24(3), 745–755 (2005).https://doi.org/10.1145/1073204.1073257

work page doi:10.1145/1073204.1073257 2005
[41]

In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1874–1883 (2016).https://doi. org/10.1109/CVPR.2016.207

work page doi:10.1109/cvpr.2016.207 2016
[42]

ACM TOG38(2019)

Vicini, D., Koltun, V., Jakob, W.: A learned shape-adaptive subsurface scattering model. ACM TOG38(2019)

2019
[43]

ACM Transactions on Graphics (ToG) 25(3), 1013–1024 (2006)

Weyrich, T., Matusik, W., Pfister, H., Bickel, B., Donner, C., Tu, C., McAnd- less, J., Lee, J., Ngan, A., Jensen, H.W., et al.: Analysis of human faces using a measurement-based skin reflectance model. ACM Transactions on Graphics (ToG) 25(3), 1013–1024 (2006)

2006
[44]

In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non- parametric instance discrimination. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 3733–3742 (2018).https://doi.org/10. 1109/CVPR.2018.00393

work page arXiv 2018
[45]

Yu, H.X., Guo, M., Fathi, A., Chang, Y.Y., Chan, E., Gao, R., Funkhouser, T., Wu, J.: Learning object-centric neural scattering functions for free-viewpoint relighting and scene composition. Trans. Mach. Learn. Res. (2023).https://doi.org/10. 48550/arXiv.2303.06138

work page arXiv 2023
[46]

In: Meila, M., Zhang, T

Zbontar, J., Jing, L., Misra, I., LeCun, Y., Deny, S.: Barlow twins: Self-supervised learning via redundancy reduction. In: Meila, M., Zhang, T. (eds.) ICML. Pro- ceedings of Machine Learning Research, vol. 139, pp. 12310–12320. PMLR (2021)

2021
[47]

Advances in Neural Information Processing Systems34, 15203–15215 (2021)

Zheng,Q.,Singh,G.,Seidel,H.P.:Neuralrelightableparticipatingmediarendering. Advances in Neural Information Processing Systems34, 15203–15215 (2021)

2021
[48]

arXiv preprint arXiv:2306.09322 (2023)

Zhu, S., Saito, S., Bozic, A., Aliaga, C., Darrell, T., Lassner, C.: Neural relight- ing with subsurface scattering by learning the radiance transfer gradient. arXiv preprint arXiv:2306.09322 (2023)

work page arXiv 2023
[49]

Optics and Lasers in Engineering109 (2018).https://doi.org/10.1016/j.optlaseng.2018.04.019 From Phase to Phenomenon 19

Zuo, C., Feng, S., Huang, L., Tao, T., Yin, W., Chen, Q.: Phase shifting algorithms for fringe projection profilometry: A review. Optics and Lasers in Engineering109 (2018).https://doi.org/10.1016/j.optlaseng.2018.04.019 From Phase to Phenomenon 19

work page doi:10.1016/j.optlaseng.2018.04.019 2018
[50]

Optics and Lasers in Engineering85(2016).https://doi.org/10.1016/j.optlaseng.2016

Zuo, C., Huang, L., Zhang, M., Chen, Q., Asundi, A.: Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review. Optics and Lasers in Engineering85(2016).https://doi.org/10.1016/j.optlaseng.2016. 04.022

work page doi:10.1016/j.optlaseng.2016 2016

[1] [1]

In: ICLR

Bardes, A., Ponce, J., LeCun, Y.: Vicreg: Variance-invariance-covariance regular- ization for self-supervised learning. In: ICLR. OpenReview.net (2022)

2022

[2] [2]

In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T

Caron, M., Misra, I., Mairal,J., Goyal, P., Bojanowski, P., Joulin, A.: Unsupervised learning of visual features by contrasting cluster assignments. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T. (eds.) NeurIPS (2020)

2020

[3] [3]

In: ICCV (2021)

Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J., Bojanowski, P., Joulin, A.: Emerging properties in self-supervised vision transformers. In: ICCV (2021)

2021

[4] [4]

A Simple Framework for Contrastive Learning of Visual Representations

Chen, T., Kornblith, S., Norouzi, M., Hinton, G.E.: A simple framework for con- trastive learning of visual representations. CoRRabs/2002.05709(2020)

work page internal anchor Pith review Pith/arXiv arXiv 2002

[5] [5]

CoRRabs/2006.10029(2020)

Chen, T., Kornblith, S., Swersky, K., Norouzi, M., Hinton, G.E.: Big self-supervised models are strong semi-supervised learners. CoRRabs/2006.10029(2020)

work page arXiv 2006

[6] [7]

In: 2008 IEEE Conference on Computer Vision and Pattern Recognition

Chen, T., Seidel, H.P., Lensch, H.P.A.: Modulated phase-shifting for 3d scanning. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–8 (2008).https://doi.org/10.1109/CVPR.2008.4587836

work page doi:10.1109/cvpr.2008.4587836 2008

[7] [8]

In: CVPR

Chen, X., He, K.: Exploring simple siamese representation learning. In: CVPR. pp. 15750–15758. Computer Vision Foundation / IEEE (2021)

2021

[8] [9]

ACM TOG37(2018)

Christensen, P., Fong, J., Shade, J., Wooten, W., Schubert, B., Kensler, A., Fried- man, S., Kilpatrick, C., Ramshaw, C., Bannister, M., Rayner, B., Brouillat, J., Liani, M.: Renderman: An advanced path-tracing architecture for movie render- ing. ACM TOG37(2018)

2018

[9] [10]

In: Proceedings of the 27th annual conference on Computer graphics and interactive techniques

Debevec, P., Hawkins, T., Tchou, C., Duiker, H.P., Sarokin, W., Sagar, M.: Ac- quiring the reflectance field of a human face. In: Proceedings of the 27th annual conference on Computer graphics and interactive techniques. pp. 145–156 (2000)

2000

[10] [11]

In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2025)

Dihlmann, J.N., Majumdar, A., Engelhardt, A., Braun, R., Lensch, H.: Subsur- face scattering for gaussian splatting. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2025)

2025

[11] [12]

In: Meila, M., Zhang, T

Ermolov, A., Siarohin, A., Sangineto, E., Sebe, N.: Whitening for self-supervised representation learning. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 3015–3024. PMLR (18–24 Jul 2021)

2021

[12] [13]

ACM TOG37(2018)

Fascione, L., Hanika, J., Leone, M., Droske, M., Schwarzhaupt, J., Davidovič, T., Weidlich, A., Meng, J.: Manuka: A batch-shading architecture for spectral path tracing in movie production. ACM TOG37(2018)

2018

[13] [14]

In: Computer Graphics Forum

Fuchs, C., Heinz, M., Levoy, M., Seidel, H.P., Lensch, H.P.: Combining confocal imaging and descattering. In: Computer Graphics Forum. vol. 27, pp. 1245–1253. Wiley Online Library (2008)

2008

[14] [15]

In: Proceedings of the Eurographics Symposium on Rendering

Garg, G., Talvala, E.V., Levoy, M., Lensch, H.P.A.: Symmetric photography: Ex- ploiting data-sparseness in reflectance fields. In: Proceedings of the Eurographics Symposium on Rendering. pp. 251–262. Eurographics Association (2006)

2006

[15] [16]

Journal of the Optical Society of America A39(2022)

Geiger, S., Hank, P., Kienle, A.: Improved topography reconstruction of volume scattering objects using structured light. Journal of the Optical Society of America A39(2022)

2022

[16] [17]

In: ACM SIGGRAPH 2004 Papers, pp

Goesele, M., Lensch, H.P., Lang, J., Fuchs, C., Seidel, H.P.: Disco: acquisition of translucent objects. In: ACM SIGGRAPH 2004 Papers, pp. 835–844. ACM (2004) From Phase to Phenomenon 17

2004

[17] [18]

In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T

Grill, J.B., Strub, F., Altché, F., Tallec, C., Richemond, P.H., Buchatskaya, E., Doersch, C., Pires, B.Á., Guo, Z., Azar, M.G., Piot, B., Kavukcuoglu, K., Munos, R., Valko, M.: Bootstrap your own latent - a new approach to self-supervised learning. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H.T. (eds.) NeurIPS (2020)

2020

[18] [19]

In: 2012 IEEE Conference on Com- puter Vision and Pattern Recognition

Gupta, M., Nayar, S.K.: Micro phase shifting. In: 2012 IEEE Conference on Com- puter Vision and Pattern Recognition. pp. 813–820 (2012).https://doi.org/10. 1109/CVPR.2012.6247753

work page arXiv 2012

[19] [20]

In: CVPR

He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.B.: Momentum contrast for unsuper- vised visual representation learning. In: CVPR. pp. 9726–9735. Computer Vision Foundation / IEEE (2020)

2020

[20] [21]

In: Leibe, B., Matas, J., Sebe, N., Welling, M

He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV

[21] [22]

pp. 630–645. Springer International Publishing, Cham (2016)

2016

[22] [23]

Optics and Lasers in Engineering142 (2021).https://doi.org/10.1016/j.optlaseng.2021.106613

He, X., Kemao, Q.: A comparative study on temporal phase unwrapping methods in high-speed fringe projection profilometry. Optics and Lasers in Engineering142 (2021).https://doi.org/10.1016/j.optlaseng.2021.106613

work page doi:10.1016/j.optlaseng.2021.106613 2021

[23] [24]

In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques

Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. pp. 511–518. SIGGRAPH ’01, Association for Computing Machinery, New York, NY, USA (2001).https: //doi.org/10.1145/383259.383319

work page doi:10.1145/383259.383319 2001

[24] [25]

In: Proceedings of the 28th annual conference on Com- puter graphics and interactive techniques

Jensen, H.W., Marschner, S.R., Levoy, M., Hanrahan, P.: A practical model for subsurface light transport. In: Proceedings of the 28th annual conference on Com- puter graphics and interactive techniques. pp. 511–518 (2001)

2001

[25] [26]

Applied optics35(13), 2304–2314 (1996)

Kienle, A., Lilge, L., Patterson, M.S., Hibst, R., Steiner, R., Wilson, B.C.: Spatially resolved absolute diffuse reflectance measurements for noninvasive determination of the optical scattering and absorption coefficients of biological tissue. Applied optics35(13), 2304–2314 (1996)

1996

[26] [27]

In: Biomedical Imaging and Sensing Conference

Kikuchi, K., Katsuyama, M., Shibata, T., Hardeberg, J., Yuasa, T., Aizu, Y.: Development of measurement system for subsurface scattering light of skin and analysis of its age-related changes. In: Biomedical Imaging and Sensing Conference. vol. 12608, pp. 123–126. SPIE (2023)

2023

[27] [28]

Kirillov, E

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., Lo, W.Y., Dollár, P., Girshick, R.: Segment anything. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (2023). https://doi.org/10.1109/iccv51070.2023.00371

work page doi:10.1109/iccv51070.2023.00371 2023

[28] [29]

In: ICCV

Koohpayegani, S.A., Tejankar, A., Pirsiavash, H.: Mean shift for self-supervised learning. In: ICCV. pp. 10306–10315. IEEE (2021)

2021

[29] [30]

In: European Conference on Computer Vision

Lyu, L., Tewari, A., Leimkühler, T., Habermann, M., Theobalt, C.: Neural radi- ance transfer fields for relightable novel-view synthesis with global illumination. In: European Conference on Computer Vision. pp. 153–169. Springer (2022)

2022

[30] [31]

In: Egger, B., Günther, T

Majumdar, A., Braun, R., Lensch, H.: Neural Acquisition & Representation of Subsurface Scattering. In: Egger, B., Günther, T. (eds.) Vision, Modeling, and Visualization. The Eurographics Association (2025).https://doi.org/10.2312/ vmv.20251228

2025

[31] [32]

In: Proceedings of the 2004 IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Malvar, H.S., wei He, L., Cutler, R.: High-quality linear interpolation for demo- saicing of Bayer-patterned color images. In: Proceedings of the 2004 IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing (ICASSP). vol. 4, pp. iv–485–8 (May 2004).https://doi.org/10.1109/ICASSP.2004.1326587 18 Majumdar et al

work page doi:10.1109/icassp.2004.1326587 2004

[32] [33]

In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019

Moreno, D., Son, K., Taubin, G.: Embedded phase shifting: Robust phase shifting with embedded signals. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2301–2309 (2015).https://doi.org/10.1109/CVPR. 2015.7298843

work page doi:10.1109/cvpr 2015

[33] [34]

ACM Trans

Nayar, S., Krishnan, G., Grossberg, M., Raskar, R.: Fast separation of direct and global components of a scene using high frequency illumination. ACM Trans. Graph.25, 935–944 (Jul 2006).https://doi.org/10.1145/1179352.1141977

work page doi:10.1145/1179352.1141977 2006

[34] [35]

ACM Trans

O’Toole, M., Kutulakos, K.N.: Optical computing for fast light transport analysis. ACM Trans. Graph.29(6), 164 (2010)

2010

[35] [36]

ACM Trans

O’Toole, M., Raskar, R., Kutulakos, K.N.: Primal-dual coding to probe light trans- port. ACM Trans. Graph.31(4), 39–1 (2012)

2012

[36] [37]

In: Proceedings of the 14th Eurographics workshop on Rendering

Peers, P., Dutré, P.: Wavelet environment matting. In: Proceedings of the 14th Eurographics workshop on Rendering. pp. 157–166 (2003)

2003

[37] [38]

ACM Transactions on Graphics (ToG)28(1), 1–18 (2009)

Peers, P., Mahajan, D.K., Lamond, B., Ghosh, A., Matusik, W., Ramamoorthi, R., Debevec, P.: Compressive light transport sensing. ACM Transactions on Graphics (ToG)28(1), 1–18 (2009)

2009

[38] [39]

U-Net: Convolutional Networks for Biomedical Image Segmentation

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed- ical image segmentation. CoRRabs/1505.04597(2015)

work page internal anchor Pith review Pith/arXiv arXiv 2015

[39] [40]

ACM Transactions on Graphics24(3), 745–755 (2005).https://doi.org/10.1145/1073204.1073257

Sen,P.,Chen,B.,Garg,G.,Marschner,S.,Sen,M.,Zoras,P.,Lunsford,P.,Genetti, J., Levoy, M.: Dual photography. ACM Transactions on Graphics24(3), 745–755 (2005).https://doi.org/10.1145/1073204.1073257

work page doi:10.1145/1073204.1073257 2005

[40] [41]

In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1874–1883 (2016).https://doi. org/10.1109/CVPR.2016.207

work page doi:10.1109/cvpr.2016.207 2016

[41] [42]

ACM TOG38(2019)

Vicini, D., Koltun, V., Jakob, W.: A learned shape-adaptive subsurface scattering model. ACM TOG38(2019)

2019

[42] [43]

ACM Transactions on Graphics (ToG) 25(3), 1013–1024 (2006)

Weyrich, T., Matusik, W., Pfister, H., Bickel, B., Donner, C., Tu, C., McAnd- less, J., Lee, J., Ngan, A., Jensen, H.W., et al.: Analysis of human faces using a measurement-based skin reflectance model. ACM Transactions on Graphics (ToG) 25(3), 1013–1024 (2006)

2006

[43] [44]

In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non- parametric instance discrimination. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 3733–3742 (2018).https://doi.org/10. 1109/CVPR.2018.00393

work page arXiv 2018

[44] [45]

Yu, H.X., Guo, M., Fathi, A., Chang, Y.Y., Chan, E., Gao, R., Funkhouser, T., Wu, J.: Learning object-centric neural scattering functions for free-viewpoint relighting and scene composition. Trans. Mach. Learn. Res. (2023).https://doi.org/10. 48550/arXiv.2303.06138

work page arXiv 2023

[45] [46]

In: Meila, M., Zhang, T

Zbontar, J., Jing, L., Misra, I., LeCun, Y., Deny, S.: Barlow twins: Self-supervised learning via redundancy reduction. In: Meila, M., Zhang, T. (eds.) ICML. Pro- ceedings of Machine Learning Research, vol. 139, pp. 12310–12320. PMLR (2021)

2021

[46] [47]

Advances in Neural Information Processing Systems34, 15203–15215 (2021)

Zheng,Q.,Singh,G.,Seidel,H.P.:Neuralrelightableparticipatingmediarendering. Advances in Neural Information Processing Systems34, 15203–15215 (2021)

2021

[47] [48]

arXiv preprint arXiv:2306.09322 (2023)

Zhu, S., Saito, S., Bozic, A., Aliaga, C., Darrell, T., Lassner, C.: Neural relight- ing with subsurface scattering by learning the radiance transfer gradient. arXiv preprint arXiv:2306.09322 (2023)

work page arXiv 2023

[48] [49]

Optics and Lasers in Engineering109 (2018).https://doi.org/10.1016/j.optlaseng.2018.04.019 From Phase to Phenomenon 19

Zuo, C., Feng, S., Huang, L., Tao, T., Yin, W., Chen, Q.: Phase shifting algorithms for fringe projection profilometry: A review. Optics and Lasers in Engineering109 (2018).https://doi.org/10.1016/j.optlaseng.2018.04.019 From Phase to Phenomenon 19

work page doi:10.1016/j.optlaseng.2018.04.019 2018

[49] [50]

Optics and Lasers in Engineering85(2016).https://doi.org/10.1016/j.optlaseng.2016

Zuo, C., Huang, L., Zhang, M., Chen, Q., Asundi, A.: Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review. Optics and Lasers in Engineering85(2016).https://doi.org/10.1016/j.optlaseng.2016. 04.022

work page doi:10.1016/j.optlaseng.2016 2016