RoDyGS: Robust Dynamic Gaussian Splatting for Casual Videos
Pith reviewed 2026-05-23 07:56 UTC · model grok-4.3
The pith
RoDyGS reconstructs dynamic 3D scenes from casual monocular videos by separating static and dynamic elements with spatiotemporal regularization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RoDyGS explicitly separates static and dynamic scene elements, and applies spatiotemporal regularization to enforce physically plausible geometry and temporally consistent motion, significantly outperforming previous pose-free dynamic novel view synthesis approaches.
What carries the argument
Explicit separation of static and dynamic scene elements combined with spatiotemporal regularization applied to a Gaussian splatting representation.
If this is right
- Dynamic novel view synthesis becomes feasible from single casual videos without known camera poses.
- Rendered outputs maintain temporally consistent motion for moving scene elements.
- Geometry in dynamic regions satisfies physical plausibility constraints enforced by regularization.
- The method competes in quality with static reconstruction techniques while handling motion.
Where Pith is reading between the lines
- The separation step could simplify downstream tasks such as object tracking or background removal in video processing pipelines.
- Regularization patterns developed here might transfer to other monocular reconstruction settings that face similar static-dynamic ambiguities.
- Testing on videos with rapid camera motion or long durations would reveal whether the regularization remains stable beyond the reported cases.
Load-bearing premise
That explicit separation of static and dynamic elements combined with spatiotemporal regularization will reliably resolve the inherent ambiguity in monocular dynamic reconstruction without additional constraints or multi-view data.
What would settle it
A monocular video sequence with complex object interactions or partial occlusions where the separation produces inaccurate 3D geometry or temporally inconsistent motion across frames.
Figures
read the original abstract
4D reconstruction from casually captured monocular videos is challenging due to inherent ambiguity in reconstructing dynamic 3D geometry. To address this challenge, we introduce Robust Dynamic Gaussian Splatting (RoDyGS), a method that reconstructs dynamic scene representation from casual monocular videos. RoDyGS explicitly separates static and dynamic scene elements, and applies spatiotemporal regularization to enforce physically plausible geometry and temporally consistent motion. Furthermore, we propose a comprehensive benchmark, Kubric-MRig, which provides extensive camera and object motion along with simultaneous multi-view capture, features that are absent in previous benchmarks. Experiments demonstrate that RoDyGS significantly outperforms previous pose-free dynamic novel view synthesis approaches and achieves competitive rendering quality compared to existing pose-free static novel view synthesis approaches. Our proejct page is available at https://rodygs.github.io
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces RoDyGS, a method for 4D reconstruction of dynamic scenes from casually captured monocular videos. It explicitly separates static and dynamic scene elements and applies spatiotemporal regularization to enforce physically plausible geometry and temporally consistent motion. The work also proposes the Kubric-MRig benchmark, which features extensive camera and object motion with simultaneous multi-view capture. Experiments are claimed to show that RoDyGS significantly outperforms prior pose-free dynamic novel view synthesis methods while achieving competitive rendering quality with pose-free static approaches.
Significance. If the central claims hold with supporting quantitative evidence, the approach would offer a practical advance in monocular dynamic reconstruction by addressing inherent ambiguities through explicit decomposition and regularization. The introduction of Kubric-MRig as a benchmark with multi-view ground truth addresses a noted gap in prior datasets and could facilitate more rigorous evaluation of pose-free dynamic methods.
major comments (1)
- [Abstract] Abstract: The abstract asserts that RoDyGS 'significantly outperforms previous pose-free dynamic novel view synthesis approaches' and achieves 'competitive rendering quality,' yet provides no quantitative results, error metrics, ablation studies, or method details to support these claims. This absence makes it impossible to assess whether the explicit static/dynamic separation and spatiotemporal regularization actually resolve the monocular ambiguities as stated.
minor comments (1)
- [Abstract] Abstract: Typo in 'proejct page' should be corrected to 'project page'.
Simulated Author's Rebuttal
We thank the referee for their review and the recommendation for major revision. We address the single major comment below regarding the abstract. We will revise the manuscript accordingly to strengthen the presentation of our claims.
read point-by-point responses
-
Referee: [Abstract] Abstract: The abstract asserts that RoDyGS 'significantly outperforms previous pose-free dynamic novel view synthesis approaches' and achieves 'competitive rendering quality,' yet provides no quantitative results, error metrics, ablation studies, or method details to support these claims. This absence makes it impossible to assess whether the explicit static/dynamic separation and spatiotemporal regularization actually resolve the monocular ambiguities as stated.
Authors: We agree that the abstract, being a high-level summary, does not include specific quantitative metrics. The supporting results, including PSNR/SSIM comparisons on Kubric-MRig and other benchmarks, ablation studies on the static/dynamic decomposition and spatiotemporal regularization, and method details, are presented in Sections 4 and 5 with Tables 1-3 and Figures 3-7. To address the concern and make the claims more self-contained, we will revise the abstract to include key quantitative highlights (e.g., average PSNR gains over prior pose-free dynamic methods) while maintaining its concise nature. revision: yes
Circularity Check
No significant circularity; method and experiments are self-contained
full rationale
The paper introduces an empirical method (RoDyGS) for dynamic scene reconstruction via explicit static/dynamic separation and spatiotemporal regularization, evaluated on a new benchmark (Kubric-MRig) and compared to prior approaches. No derivation chain, equations, or first-principles predictions are present in the provided text that could reduce to fitted inputs or self-citations by construction. Claims rest on proposed architecture and experimental outcomes rather than any self-definitional or load-bearing self-referential steps, making the work self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Nonrigid structure from motion in trajectory space
Ijaz Akhter, Yaser Sheikh, Sohaib Khan, and Takeo Kanade. Nonrigid structure from motion in trajectory space. Ad- vances in neural information processing systems , 21, 2008. 5
work page 2008
-
[2]
Nope-nerf: Optimising neu- ral radiance field with no pose prior
Wenjing Bian, Zirui Wang, Kejie Li, Jia-Wang Bian, and Victor Adrian Prisacariu. Nope-nerf: Optimising neu- ral radiance field with no pose prior. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4160–4169, 2023. 3, 7, 16
work page 2023
-
[3]
D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black. A naturalistic open source movie for optical flow evaluation. In European Conf. on Computer Vision (ECCV), pages 611–
-
[4]
Springer-Verlag, 2012. 6
work page 2012
-
[5]
Hexplane: A fast representa- tion for dynamic scenes
Ang Cao and Justin Johnson. Hexplane: A fast representa- tion for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 130–141, 2023. 2
work page 2023
-
[6]
Gaussianeditor: Swift and control- lable 3d editing with gaussian splatting
Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xi- aofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, and Guosheng Lin. Gaussianeditor: Swift and control- lable 3d editing with gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21476–21485, 2024. 2
work page 2024
-
[7]
Gaussian activated neural radiance fields for high fidelity reconstruction and pose estimation
Shin-Fang Chng, Sameera Ramasinghe, Jamie Sherrah, and Simon Lucey. Gaussian activated neural radiance fields for high fidelity reconstruction and pose estimation. In Eu- ropean Conference on Computer Vision , pages 264–280. Springer, 2022. 3
work page 2022
-
[8]
Cosseggaussians: Compact and swift scene segmenting 3d gaussians with dual feature fusion
Bin Dou, Tianyu Zhang, Yongjia Ma, Zhaohui Wang, and Zejian Yuan. Cosseggaussians: Compact and swift scene segmenting 3d gaussians with dual feature fusion. CoRR,
-
[9]
Google scanned objects: A high- quality dataset of 3d scanned household items
Laura Downs, Anthony Francis, Nate Koenig, Brandon Kin- man, Ryan Hickman, Krista Reymann, Thomas B McHugh, and Vincent Vanhoucke. Google scanned objects: A high- quality dataset of 3d scanned household items. In 2022 In- ternational Conference on Robotics and Automation (ICRA), pages 2553–2560. IEEE, 2022. 10
work page 2022
-
[10]
InstantSplat: Sparse-view gaussian splatting in seconds.arXiv preprint arXiv:2403.20309, 2024
Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, et al. Instantsplat: Un- bounded sparse-view pose-free gaussian splatting in 40 sec- onds. arXiv preprint arXiv:2403.20309, 2024. 1, 3
-
[11]
Fast dynamic radiance fields with time-aware neural voxels
Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xi- aopeng Zhang, Wenyu Liu, Matthias Nießner, and Qi Tian. Fast dynamic radiance fields with time-aware neural voxels. In SIGGRAPH Asia 2022 Conference Papers, 2022. 13
work page 2022
-
[12]
Plenoxels: Radiance fields without neural networks
Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5501–5510, 2022. 2
work page 2022
-
[13]
K-planes: Explicit radiance fields in space, time, and appearance
Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. In CVPR, 2023. 2
work page 2023
-
[14]
Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, and Xiaolong Wang. Colmap-free 3d gaussian splat- ting. In Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 20796– 20805, 2024. 3, 7, 10, 16
work page 2024
-
[15]
Dynamic view synthesis from dynamic monocular video
Chen Gao, Ayush Saraf, Johannes Kopf, and Jia-Bin Huang. Dynamic view synthesis from dynamic monocular video. In Proceedings of the IEEE International Conference on Com- puter Vision, 2021. 13
work page 2021
-
[16]
Monocular dynamic view synthesis: A reality check
Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, and Angjoo Kanazawa. Monocular dynamic view synthesis: A reality check. Advances in Neural Information Processing Systems, 35:33768–33780, 2022. 2, 6, 7, 10, 13
work page 2022
-
[17]
Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J Fleet, Dan Gnanapra- gasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh- Ti (Derek) Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Rad- wan, Daniel Rebain, Sara Sabour...
work page 2022
-
[18]
Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation
Reinhard Heckel and Mahdi Soltanolkotabi. Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation. In International Conference on Machine Learning, pages 4149–4158. PMLR, 2020. 5
work page 2020
-
[19]
Baking neural ra- diance fields for real-time view synthesis
Peter Hedman, Pratul P Srinivasan, Ben Mildenhall, Jonathan T Barron, and Paul Debevec. Baking neural ra- diance fields for real-time view synthesis. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5875–5884, 2021. 2
work page 2021
-
[20]
Derek Hoiem, Alexei A Efros, and Martial Hebert. Au- tomatic photo pop-up. In ACM SIGGRAPH 2005 Papers , pages 577–584. 2005. 2
work page 2005
-
[21]
Sc-gs: Sparse-controlled gaussian splatting for editable dynamic scenes
Yi-Hua Huang, Yang-Tian Sun, Ziyi Yang, Xiaoyang Lyu, Yan-Pei Cao, and Xiaojuan Qi. Sc-gs: Sparse-controlled gaussian splatting for editable dynamic scenes. In Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4220–4230, 2024. 2
work page 2024
-
[22]
Self-calibrating neural radiance fields
Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Anima Anandkumar, Minsu Cho, and Jaesik Park. Self-calibrating neural radiance fields. In Proceedings of the IEEE/CVF In- ternational Conference on Computer Vision , pages 5846– 5854, 2021. 3, 7
work page 2021
-
[23]
Perfception: Perception using radiance fields
Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Chris Choy, An- ima Anandkumar, Minsu Cho, and Jaesik Park. Perfception: Perception using radiance fields. Advances in Neural Infor- mation Processing Systems, 35:26105–26121, 2022. 12
work page 2022
-
[24]
Co- tracker: It is better to track together
Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, and Christian Rupprecht. Co- tracker: It is better to track together. arXiv preprint arXiv:2307.07635, 2023. 2
-
[25]
3d gaussian splatting for real-time radiance field rendering
Bernhard Kerbl, Georgios Kopanas, Thomas Leimk ¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1,
-
[26]
3d gaussian splatting as markov chain monte carlo
Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Wei- wei Sun, Jeff Tseng, Hossam Isack, Abhishek Kar, An- drea Tagliasacchi, and Kwang Moo Yi. 3d gaussian splatting as markov chain monte carlo. arXiv preprint arXiv:2404.09591, 2024. 2
-
[27]
Laplacianfusion: Detailed 3d clothed- human body reconstruction
Hyomin Kim, Hyeonseo Nam, Jungeon Kim, Jaesik Park, and Seungyong Lee. Laplacianfusion: Detailed 3d clothed- human body reconstruction. ACM Transactions on Graphics (TOG), 41(6):1–14, 2022. 5
work page 2022
-
[28]
Tanks and temples: Benchmarking large-scale scene reconstruction
Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics (ToG) , 36 (4):1–13, 2017. 2, 6, 7, 10
work page 2017
-
[29]
Point-based neural rendering with per- view optimization
Georgios Kopanas, Julien Philip, Thomas Leimk ¨uhler, and George Drettakis. Point-based neural rendering with per- view optimization. In Computer Graphics Forum, pages 29–
-
[30]
Wiley Online Library, 2021. 2
work page 2021
-
[31]
Dynmf: Neural motion factorization for real-time dynamic view synthesis with 3d gaussian splatting
Agelos Kratimenos, Jiahui Lei, and Kostas Daniilidis. Dynmf: Neural motion factorization for real-time dynamic view synthesis with 3d gaussian splatting. arXiV, 2023. 2, 3, 5, 7, 8, 10
work page 2023
-
[32]
Multi- body non-rigid structure-from-motion
Suryansh Kumar, Yuchao Dai, and Hongdong Li. Multi- body non-rigid structure-from-motion. In 2016 Fourth In- ternational Conference on 3D Vision (3DV), pages 148–156. IEEE, 2016. 5
work page 2016
-
[33]
Fast view synthesis of casual videos
Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, and Feng Liu. Fast view synthesis of casual videos. arXiv preprint arXiv:2312.02135, 2023. 2
-
[34]
Grounding image matching in 3d with mast3r, 2024
Vincent Leroy, Yohann Cabon, and J´erˆome Revaud. Ground- ing image matching in 3d with mast3r. arXiv preprint arXiv:2406.09756, 2024. 1, 2, 4, 6, 10, 12
-
[35]
Neural 3d video synthesis from multi-view video
Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, et al. Neural 3d video synthesis from multi-view video. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 5521–5531, 2022. 2
work page 2022
-
[36]
Barf: Bundle-adjusting neural radiance fields
Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Si- mon Lucey. Barf: Bundle-adjusting neural radiance fields. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5741–5751, 2021. 3, 7, 16
work page 2021
-
[37]
Robust dynamic radiance fields
Yu-Lun Liu, Chen Gao, Andreas Meuleman, Hung-Yu Tseng, Ayush Saraf, Changil Kim, Yung-Yu Chuang, Jo- hannes Kopf, and Jia-Bin Huang. Robust dynamic radiance fields. In Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 13–23, 2023. 2, 3, 4, 6, 7, 11, 12, 13, 15, 17, 20, 21
work page 2023
-
[38]
Dynamic 3d gaussians: Tracking by per- sistent dynamic view synthesis
Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by per- sistent dynamic view synthesis. In 3DV, 2024. 2
work page 2024
-
[39]
Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar
Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view syn- thesis with prescriptive sampling guidelines. ACM Transac- tions on Graphics (TOG), 2019. 13
work page 2019
-
[40]
Srinivasan, Matthew Tancik, Jonathan T
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view syn- thesis. In ECCV, 2020. 1
work page 2020
-
[41]
Nerf: Representing scenes as neural radiance fields for view syn- thesis
Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view syn- thesis. Communications of the ACM , 65(1):99–106, 2021. 2
work page 2021
-
[42]
Instant neural graphics primitives with a mul- tiresolution hash encoding
Thomas M ¨uller, Alex Evans, Christoph Schied, and Alexan- der Keller. Instant neural graphics primitives with a mul- tiresolution hash encoding. ACM transactions on graphics (TOG), 41(4):1–15, 2022. 2
work page 2022
-
[43]
Nerfies: Deformable neural radiance fields
Keunhong Park, Utkarsh Sinha, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Steven M Seitz, and Ricardo Martin-Brualla. Nerfies: Deformable neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5865–5874, 2021. 1, 2
work page 2021
-
[44]
Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin- Brualla, and Steven M
Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T. Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin- Brualla, and Steven M. Seitz. Hypernerf: A higher- dimensional representation for topologically varying neural radiance fields. ACM Trans. Graph., 40(6), 2021. 6, 13
work page 2021
-
[45]
D-NeRF: Neural Radiance Fields for Dynamic Scenes
Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-NeRF: Neural Radiance Fields for Dynamic Scenes. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, 2020. 1, 2, 6, 7, 15, 17, 20, 21
work page 2020
-
[46]
Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer
Ren ´e Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Ma- chine Intelligence (TPAMI), 2020. 2
work page 2020
-
[47]
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman R¨adle, Chloe Rolland, Laura Gustafson, et al. Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714, 2024. 11, 13, 19
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[48]
Gernot Riegler and Vladlen Koltun. Free view synthesis. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIX 16, pages 623–640. Springer, 2020. 2
work page 2020
-
[49]
Gernot Riegler and Vladlen Koltun. Stable view synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12216–12225, 2021. 2
work page 2021
-
[50]
The convergence rate of neural networks for learned functions of different frequencies
Basri Ronen, David Jacobs, Yoni Kasten, and Shira Kritch- man. The convergence rate of neural networks for learned functions of different frequencies. Advances in Neural In- formation Processing Systems, 32, 2019. 5
work page 2019
-
[51]
Structure- from-motion revisited
Johannes L Schonberger and Jan-Michael Frahm. Structure- from-motion revisited. In Proceedings of the IEEE con- ference on computer vision and pattern recognition , pages 4104–4113, 2016. 1, 2
work page 2016
-
[52]
Improved direct voxel grid optimization for radiance fields reconstruc- tion
Cheng Sun, Min Sun, and Hwann-Tzong Chen. Improved direct voxel grid optimization for radiance fields reconstruc- tion. arXiv preprint arXiv:2206.05085, 2022. 2
-
[53]
Raft: Recurrent all-pairs field transforms for optical flow
Zachary Teed and Jia Deng. Raft: Recurrent all-pairs field transforms for optical flow. In Computer Vision–ECCV 23 2020: 16th European Conference, Glasgow, UK, August 23– 28, 2020, Proceedings, Part II 16, pages 402–419. Springer,
work page 2020
-
[54]
Shape of motion: 4d reconstruc- tion from a single video
Qianqian Wang, Vickie Ye, Hang Gao, Jake Austin, Zhengqi Li, and Angjoo Kanazawa. Shape of motion: 4d reconstruc- tion from a single video. 2024. 2
work page 2024
-
[55]
Shape of motion: 4d reconstruc- tion from a single video
Qianqian Wang, Vickie Ye, Hang Gao, Jake Austin, Zhengqi Li, and Angjoo Kanazawa. Shape of motion: 4d reconstruc- tion from a single video. arXiv preprint arXiv:2407.13764,
-
[56]
Dust3r: Geometric 3d vi- sion made easy
Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. Dust3r: Geometric 3d vi- sion made easy. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20697– 20709, 2024. 1, 3
work page 2024
-
[57]
Gflow: Recovering 4d world from monocular video
Shizun Wang, Xingyi Yang, Qiuhong Shen, Zhenxiang Jiang, and Xinchao Wang. Gflow: Recovering 4d world from monocular video. arXiv preprint arXiv:2405.18426, 2024. 2
-
[58]
NeRF −−: Neural radiance fields without known camera parameters,
Zirui Wang, Shangzhe Wu, Weidi Xie, Min Chen, and Victor Adrian Prisacariu. Nerf–: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064, 2021. 3, 7
-
[59]
4d gaussian splatting for real-time dynamic scene rendering
Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20310–20320, 2024. 2, 6, 7, 15, 17, 20, 21
work page 2024
-
[60]
Sparsegs: Real- time 360 {\deg} sparse view synthesis using gaussian splat- ting
Haolin Xiong, Sairisheek Muttukuru, Rishi Upadhyay, Pradyumna Chari, and Achuta Kadambi. Sparsegs: Real- time 360 {\deg} sparse view synthesis using gaussian splat- ting. arXiv preprint arXiv:2312.00206, 2023. 2, 6
-
[61]
Point- nerf: Point-based neural radiance fields
Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, and Ulrich Neumann. Point- nerf: Point-based neural radiance fields. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5438–5448, 2022. 2
work page 2022
-
[62]
arXiv preprint arXiv:2304.11968 (2023)
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, and Feng Zheng. Track anything: Segment anything meets videos. arXiv preprint arXiv:2304.11968, 2023. 2, 3, 4, 6, 7, 10, 11, 12, 13, 19, 20, 21
-
[63]
Depth anything: Unleashing the power of large-scale unlabeled data
Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, and Hengshuang Zhao. Depth anything: Unleashing the power of large-scale unlabeled data. In CVPR, 2024. 2, 4, 6, 10
work page 2024
-
[64]
Deformable 3d gaussians for high- fidelity monocular dynamic scene reconstruction
Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, and Xiaogang Jin. Deformable 3d gaussians for high- fidelity monocular dynamic scene reconstruction. In Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20331–20341, 2024. 6, 7, 15, 17, 20, 21
work page 2024
-
[65]
Real- time photorealistic dynamic scene representation and render- ing with 4d gaussian splatting
Zeyu Yang, Hongye Yang, Zijie Pan, and Li Zhang. Real- time photorealistic dynamic scene representation and render- ing with 4d gaussian splatting. In International Conference on Learning Representations (ICLR), 2024. 2, 6, 7, 15, 17, 20, 21
work page 2024
-
[66]
Absgs: Recovering fine details in 3d gaussian splat- ting
Zongxin Ye, Wenyu Li, Sidun Liu, Peng Qiao, and Yong Dou. Absgs: Recovering fine details in 3d gaussian splat- ting. In ACM Multimedia 2024, 2024. 2
work page 2024
-
[67]
inerf: Inverting neural radiance fields for pose estimation
Lin Yen-Chen, Pete Florence, Jonathan T Barron, Alberto Rodriguez, Phillip Isola, and Tsung-Yi Lin. inerf: Inverting neural radiance fields for pose estimation. In2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1323–1330. IEEE, 2021. 3
work page 2021
-
[68]
Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera
Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, and Jan Kautz. Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. In Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5336–5345, 2020. 6, 10, 12, 13
work page 2020
-
[69]
Cor-gs: Sparse-view 3d gaussian splat- ting via co-regularization
Jiawei Zhang, Jiahe Li, Xiaohan Yu, Lei Huang, Lin Gu, Jin Zheng, and Xiao Bai. Cor-gs: Sparse-view 3d gaussian splat- ting via co-regularization. arXiv preprint arXiv:2405.12110,
-
[70]
Differentiable point-based radiance fields for efficient view synthesis
Qiang Zhang, Seung-Hwan Baek, Szymon Rusinkiewicz, and Felix Heide. Differentiable point-based radiance fields for efficient view synthesis. In SIGGRAPH Asia 2022 Con- ference Papers, pages 1–12, 2022. 2
work page 2022
-
[71]
M. Zwicker, H. Pfister, J. van Baar, and M. Gross. Ewa splatting. IEEE Transactions on Visualization and Computer Graphics, 8(3):223–238, 2002. 3 24
work page 2002
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.