Physically Plausible Human-Object Rendering from Sparse Views via 3D Gaussian Splatting
Pith reviewed 2026-05-23 01:02 UTC · model grok-4.3
The pith
HOGS renders physically plausible human-object interactions from sparse views by optimizing dynamic 3D Gaussians with contact constraints.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HOGS represents both humans and objects as dynamic 3D Gaussians. A novel optimization process operates directly on these Gaussians to enforce geometric consistency, preventing inter-penetration or floating contacts, thereby achieving physical plausibility. Two pre-trained modules—an optimization-guided Human Pose Refiner and a Human-Object Contact Predictor—supply accurate pose and contact estimates to support the optimization under sparse-view ambiguity.
What carries the argument
Dynamic 3D Gaussians optimized via contact and separation losses, guided by a Human Pose Refiner and Human-Object Contact Predictor.
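As a rough illustration of how contact and separation terms could act on Gaussian centers, the sketch below uses nearest-neighbour distances. It is not the paper's formulation, which this page does not reproduce. The function names, the `margin` value, the boolean `contact_mask` (standing in for the Contact Predictor's output), and the use of proximity as a proxy for penetration are all assumptions made for illustration.

```python
import math

def nearest_distance(point, cloud):
    """Euclidean distance from `point` to the closest point in `cloud`."""
    return min(math.dist(point, q) for q in cloud)

def contact_loss(human_centers, object_centers, contact_mask):
    """Mean squared gap between predicted-contact human centers and the object."""
    gaps = [nearest_distance(p, object_centers)
            for p, in_contact in zip(human_centers, contact_mask) if in_contact]
    return sum(g * g for g in gaps) / len(gaps) if gaps else 0.0

def separation_loss(human_centers, object_centers, contact_mask, margin=0.02):
    """Squared hinge on non-contact human centers closer than `margin` to the object."""
    penalties = [max(0.0, margin - nearest_distance(p, object_centers)) ** 2
                 for p, in_contact in zip(human_centers, contact_mask)
                 if not in_contact]
    return sum(penalties) / len(penalties) if penalties else 0.0

# Hypothetical toy data: two human centers, two object centers.
human = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
obj = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.01)]
mask = [True, False]  # first center is predicted to be in contact
print(contact_loss(human, obj, mask), separation_loss(human, obj, mask))
```

The first term pulls predicted-contact centers toward the object (discouraging floating contacts), and the second pushes non-contact centers away from it. A real penetration test would need an inside/outside query against the object surface; the hinge on proximity is only a stand-in.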
If this is right
- Enables rendering of human-object and hand-object scenes from sparse views while maintaining physical plausibility.
- Achieves state-of-the-art rendering quality alongside high computational efficiency.
- Direct optimization on the Gaussians prevents inter-penetration and enforces proper contact without post-processing.
- The framework supports both full-body and hand-scale interactions on existing datasets.
Where Pith is reading between the lines
- The same Gaussian representation and loss structure could apply to other dynamic scenes requiring geometric constraints, such as multi-object stacking.
- If the contact predictor generalizes, the method may reduce reliance on dense views in real-world capture setups.
- Efficiency gains suggest possible use in interactive applications where both realism and speed matter.
- Failure modes in the refiner module would likely appear first under heavy occlusion or unusual poses.
Load-bearing premise
The pre-trained pose refiner and contact predictor modules produce sufficiently accurate estimates from sparse views to guide the losses without introducing new errors.
What would settle it
The claim would be undermined if rendered outputs on the test datasets exhibit inter-penetration or floating contacts where ground-truth interactions show touching, or if rendering metrics fall below those of prior methods.
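One way such a check could be scripted, assuming the object is available as a signed distance function (negative inside the surface). This is a minimal sketch: `sphere_sdf`, `plausibility_report`, the tolerance `tol`, and the toy sphere are hypothetical and are not taken from the paper or its evaluation protocol.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.dist(point, center) - radius

def plausibility_report(points, gt_contact, sdf, tol=0.01):
    """Rate of penetrating points and of ground-truth contacts that float."""
    if not points:
        return {"penetration_rate": 0.0, "floating_rate": 0.0}
    distances = [sdf(p) for p in points]
    # A point penetrates if it lies deeper than `tol` inside the object.
    penetrating = sum(d < -tol for d in distances)
    # A ground-truth contact floats if it sits more than `tol` off the surface.
    floating = sum(c and d > tol for d, c in zip(distances, gt_contact))
    n_contact = sum(gt_contact)
    return {
        "penetration_rate": penetrating / len(points),
        "floating_rate": floating / n_contact if n_contact else 0.0,
    }

# Hypothetical toy case: unit sphere at the origin, three human surface points.
points = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.5), (0.0, 0.0, 1.2)]
gt_contact = [True, False, True]
report = plausibility_report(
    points, gt_contact, lambda p: sphere_sdf(p, (0.0, 0.0, 0.0), 1.0))
print(report)
```

Non-zero rates on frames where the ground truth shows clean touching would be the kind of evidence the paragraph above describes.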
Original abstract
Rendering realistic human-object interactions (HOIs) from sparse-view inputs is a challenging yet crucial task for various real-world applications. Existing methods often struggle to simultaneously achieve high rendering quality, physical plausibility, and computational efficiency. To address these limitations, we propose HOGS (Human-Object Rendering via 3D Gaussian Splatting), a novel framework for efficient HOI rendering with physically plausible geometric constraints from sparse views. HOGS represents both humans and objects as dynamic 3D Gaussians. Central to HOGS is a novel optimization process that operates directly on these Gaussians to enforce geometric consistency (i.e., preventing inter-penetration or floating contacts) to achieve physical plausibility. To support this core optimization under sparse-view ambiguity, our framework incorporates two pre-trained modules: an optimization-guided Human Pose Refiner for robust estimation under sparse-view occlusions, and a Human-Object Contact Predictor that efficiently identifies interaction regions to guide our novel contact and separation losses. Extensive experiments on both human-object and hand-object interaction datasets demonstrate that HOGS achieves state-of-the-art rendering quality and maintains high computational efficiency.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents HOGS, a framework for rendering human-object interactions (HOIs) from sparse views. It represents humans and objects as dynamic 3D Gaussians and performs optimization directly on these Gaussians to enforce physical plausibility via novel contact and separation losses that prevent inter-penetration and floating contacts. Two fixed pre-trained modules—an optimization-guided Human Pose Refiner and a Human-Object Contact Predictor—supply the contact regions and refined poses that guide the losses under sparse-view ambiguity. Experiments on human-object and hand-object interaction datasets are reported to achieve state-of-the-art rendering quality while maintaining computational efficiency.
Significance. If the pre-trained modules prove reliable under the targeted sparse-view occlusions, the approach could advance efficient, physically constrained rendering of interactions by combining dynamic 3D Gaussian Splatting with geometric losses. The direct optimization on Gaussians and the use of contact-aware terms address limitations in prior methods. The significance is limited, however, by the absence of independent validation for the modules that supply the physical constraints.
major comments (1)
- [Sections 3.3 and 3.4] The contact and separation losses are defined directly on the outputs of the fixed pre-trained Human Pose Refiner and Human-Object Contact Predictor. No ablation studies isolate the accuracy of these modules (pose error, contact precision/recall) on sparse-view data against ground truth. Because the modules remain frozen during Gaussian optimization, errors they produce under occlusion would propagate into the physical-plausibility constraints with no independent recovery mechanism, undermining the central claim that the optimization enforces geometric consistency.
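The module-level measurements named in this comment are standard and simple to compute. A minimal sketch, assuming per-vertex boolean contact labels and matched lists of 3D joint positions; the function names and data layout are hypothetical and do not come from the paper.

```python
import math

def contact_precision_recall(predicted, ground_truth):
    """Precision and recall of boolean per-vertex contact predictions."""
    pairs = list(zip(predicted, ground_truth))
    tp = sum(p and g for p, g in pairs)        # predicted contact, truly contact
    fp = sum(p and not g for p, g in pairs)    # predicted contact, truly free
    fn = sum(g and not p for p, g in pairs)    # missed contact
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_joint_error(pred_joints, gt_joints):
    """Mean Euclidean distance between corresponding predicted and true joints."""
    errors = [math.dist(p, g) for p, g in zip(pred_joints, gt_joints)]
    return sum(errors) / len(errors) if errors else 0.0

# Hypothetical toy labels and joints.
pred_contact = [True, True, False, False]
gt_contact = [True, False, True, False]
precision, recall = contact_precision_recall(pred_contact, gt_contact)

pred_joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt_joints = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.3)]
pose_error = mean_joint_error(pred_joints, gt_joints)
print(precision, recall, pose_error)
```

Reporting these numbers on sparse-view test inputs, separately from the rendering metrics, would show how much error the frozen modules feed into the losses.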
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comment point by point below.
- Referee: Sections 3.3 and 3.4: The contact and separation losses are defined directly on the outputs of the fixed pre-trained Human Pose Refiner and Human-Object Contact Predictor. No ablation studies isolate the accuracy of these modules (pose error, contact precision/recall) on sparse-view data against ground truth. Because the modules remain frozen during Gaussian optimization, errors they produce under occlusion would propagate into the physical-plausibility constraints with no independent recovery mechanism, undermining the central claim that the optimization enforces geometric consistency.
  Authors: We acknowledge that the current manuscript does not include isolated ablation studies evaluating the pose error or contact precision/recall of the fixed pre-trained Human Pose Refiner and Human-Object Contact Predictor specifically on sparse-view inputs against ground truth. The modules are indeed held fixed during the Gaussian optimization, as stated in Sections 3.3 and 3.4, so any inaccuracies under heavy occlusion would directly influence the contact and separation losses. Our defense of the central claim rests on the end-to-end experimental results: HOGS achieves state-of-the-art rendering quality and physical-plausibility metrics on both human-object and hand-object datasets, outperforming baselines that lack these geometric constraints. This indicates that the overall optimization produces plausible outputs in practice. To strengthen the presentation, we will add the requested module-level ablations (pose error and contact metrics on sparse-view test data) to the revised manuscript.
  Revision: yes
Circularity Check
No circularity: framework uses external pre-trained modules and novel losses
The paper presents HOGS as a forward proposal that represents humans and objects as dynamic 3D Gaussians and introduces a new optimization process with contact and separation losses. These losses are guided by two explicitly pre-trained modules (Human Pose Refiner and Human-Object Contact Predictor) described as fixed inputs. No derivation, equation, or claim in the provided text reduces a performance quantity to a fitted parameter from the same data, renames a known result, or relies on a load-bearing self-citation chain. The method is therefore self-contained against external benchmarks and the central rendering claims do not collapse by construction.
Forward citations
Cited by 1 Pith paper
- Rendering Multi-Human and Multi-Object with 3D Gaussian Splatting
  MM-GS combines per-instance multi-view fusion with scene-level interaction modeling on 3D Gaussians to render high-fidelity multi-human multi-object scenes from sparse views.