Learning Efficient 4D Gaussian Representations from Monocular Videos with Flow Splatting

Jinzhao Li; Shengjun Zhang; Xin Fei; Yueqi Duan

arxiv: 2606.29976 · v1 · pith:5KKFRO7Inew · submitted 2026-06-29 · 💻 cs.CV

Learning Efficient 4D Gaussian Representations from Monocular Videos with Flow Splatting

Shengjun Zhang , Jinzhao Li , Xin Fei , Yueqi Duan This is my paper

Pith reviewed 2026-06-30 06:55 UTC · model grok-4.3

classification 💻 cs.CV

keywords 4D Gaussian splattingdynamic scene reconstructionmonocular videosoptical flowvelocity fieldnovel view synthesisflow splatting

0 comments

The pith

Flow Splatting renders optical flow from 4D Gaussian velocity fields to supervise efficient dynamic scene reconstruction from monocular videos.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Flow Splatting to learn 4D Gaussian representations from monocular videos by constructing a velocity field from time-varying means and covariances. This field is then splatted using an extended volume rendering approach that accounts for camera motion, producing optical flow to supervise the dynamics without ground-truth data. The approach addresses long training times, low rendering speeds, and high memory use in prior 4D extensions of Gaussian splatting. A sympathetic reader would care because the method claims to deliver higher image quality with reduced training time and increased rendering speed on standard benchmarks.

Core claim

We extend 4D volumes with time varying means and covariance to represent complex dynamics. Then, we construct and approximate the velocity field naturally based on this representations. While conventional volume rendering techniques support to render color fields, we extend the volume rendering strategy to splat the velocity field by considering the influence of camera motions. This enables Flow Splatting to render optical flow from the velocity field to supervise the dynamics learning process from monocular videos.

What carries the argument

Flow Splatting: the extension of conventional splatting to render optical flow by approximating and splatting the velocity field constructed from time-varying 4D Gaussian means and covariances while accounting for camera motion.

If this is right

The model achieves better image quality than state-of-the-art methods.
Training requires less time consumption than prior 4D Gaussian extensions.
Rendering speed is higher while maintaining or improving quality.
The representation avoids high memory consumption associated with per-frame 4D volumes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The velocity-field approximation might be adapted to supply other dense signals such as depth or surface normals in monocular settings.
The same splatting extension could be applied to non-Gaussian representations that already output time-varying positions.
If the camera-motion correction holds across wide baselines, the method could support reconstruction from casually captured handheld videos without additional calibration.

Load-bearing premise

The velocity field naturally constructed from time-varying 4D Gaussian means and covariances can be accurately approximated and splatted to provide reliable optical-flow supervision without introducing artifacts or requiring ground-truth flow.

What would settle it

A direct comparison on a controlled dynamic sequence where the rendered optical flow from the approximated velocity field deviates substantially from ground-truth motion or produces worse reconstruction metrics than baselines would falsify the reliability of the supervision signal.

Figures

Figures reproduced from arXiv: 2606.29976 by Jinzhao Li, Shengjun Zhang, Xin Fei, Yueqi Duan.

**Figure 1.** Figure 1: Comparison of previous methods and ours. (a) We visualize the rendering results of color and optical flow for our baseline [63] and Flow Splatting. (b) We report PSNR and the rendering speed on the NVIDIA dataset [66] for multiple methods. The size of circle represents training time. Gaussians is not maintained as in physical world with insufficient viewpoints [16], leading to visual overfitting, performan… view at source ↗

**Figure 2.** Figure 2: Overview of our framework. We first leverage off-the-shelf models to predict optical flow, depth maps, masks and trajectories for the input monocular video. Then, we initialize the foreground and background respectively from the data-driven priors. Apart from the color field, we construct velocity field and propose the flow splatting strategy to render optical flow from velocity field. We introduce both co… view at source ↗

**Figure 3.** Figure 3: Qualitative Comparison on the Davis dataset. We present the qualitative comparison between our method and previous methods for novel view and novel time synthesis. 4.2 Main Results Quantitative and Qualitative Results. We conduct experiments for novel view synthesis on the NVIDIA dataset [66] and report the quantitative results in [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

**Figure 4.** Figure 4: Qualitative Comparison of optical flow. We render the optical flow of all methods with flow splatting. The baseline stands for 4DGS [63] which presents the 4D volume. w/o 4D Gaussian indicates the absence of Polynomials and Fourier series. w/o velocity loss represents that the method is trained without supervision on the velocity field. w/o initialization means that the initialization is replaced by random… view at source ↗

read the original abstract

Reconstructing dynamic 3D scenes from monocular videos is challenging due to scene complexity and temporal dynamics. With the advancement of 3D Gaussian Splatting in novel view synthesis, existing methods extend 3D Gaussians to 4D domain with deformation fields, trajectories or spatiotemporal 4D volumes to model scene element deformation. However, these methods suffer from long training time, low rendering speed or high memory consumption for per-frame reconstruction of 4D volumes, without fully exploiting dense dynamic information. To address this issue, we propose Flow Splatting, which constructs the velocity field and enables the conventional splatting technique to render optical flow from the velocity field to supervise dynamics learning process from monocular videos. Specifically, we extend 4D volumes with time varying means and covariance to represent complex dynamics. Then, we construct and approximate the velocity field naturally based on this representations. While conventional volume rendering techniques support to render color fields, we extend the volume rendering strategy to splat the velocity field by considering the influence of camera motions. We conduct experiments on various benchmarks to demonstrate the efficiency and effectiveness of our method. Compared to the state-of-the-art methods, our model achieves better image quality with less time consumption and higher rendering speed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's new move is splatting an approximated velocity field from time-varying 4D Gaussians to render optical flow for dynamics supervision, which targets training speed and memory in monocular 4D reconstruction.

read the letter

The main contribution here is extending 4D Gaussian volumes with time-varying means and covariances, then building and approximating a velocity field from those to enable splatting that produces optical flow. This flow serves as supervision for learning dynamics directly from monocular video, while accounting for camera motion in the rendering step.

This is a distinct supervision trick compared to the deformation fields or trajectory methods mentioned in the abstract. It aims squarely at the practical problems of long training times, low rendering speeds, and high memory use that plague current 4D Gaussian approaches.

The approach is straightforward in concept and the abstract reports better image quality alongside lower training time and higher rendering speed than SOTA. That combination would be useful if the results hold.

The soft spot is the velocity field approximation itself. The abstract calls the construction 'natural' but gives no derivation, error analysis, or ablation on how well it tracks under non-rigid motion or fast camera movement. If the approximation introduces bias or noise, the supervision signal weakens and the claimed efficiency gains become harder to trust. The stress-test concern about artifact-free flow is reasonable to check against the actual experiments.

This paper is for researchers already working on efficient 4D Gaussian or dynamic novel-view methods who want a concrete new supervision signal. It deserves a serious referee because the mechanism is specific enough to evaluate and the efficiency claims address a real bottleneck, even if the approximation needs closer scrutiny in review.

Referee Report

2 major / 2 minor

Summary. The paper proposes Flow Splatting, an extension of 4D Gaussian Splatting for dynamic scene reconstruction from monocular videos. It represents scenes with time-varying 4D Gaussians (means and covariances), constructs and approximates a velocity field from these, and extends volume rendering to splat the velocity field (accounting for camera motion) to render optical flow as a supervision signal for learning dynamics. Experiments on benchmarks claim superior image quality, reduced training time, and higher rendering speed versus prior SOTA methods that use deformation fields, trajectories, or full 4D volumes.

Significance. If the velocity-field approximation and camera-aware splatting prove reliable, the method could provide an efficient, monocular-only supervision mechanism that avoids external flow networks or per-frame 4D volumes, improving training speed and rendering performance for dynamic novel-view synthesis.

major comments (2)

[Method description of velocity-field construction and approximation] The central claim that the approximated velocity field yields reliable optical-flow supervision (and thereby the reported quality/time gains) lacks any derivation, error bound, or quantitative validation of approximation accuracy. This construction is load-bearing for the supervision signal yet is described only qualitatively in the abstract and method overview.
[Experiments and ablation studies] No ablation or analysis is reported on how the velocity-field splatting behaves under non-rigid motion or fast camera movement—the exact regimes where the skeptic concern predicts artifacts that would bias dynamics learning. Such controls are required to substantiate that the supervision signal remains artifact-free.

minor comments (2)

The abstract and method summary contain no equations for the velocity-field approximation or the extended splatting integral; adding these would clarify the technical contribution.
Quantitative comparisons would benefit from explicit reporting of training time, rendering FPS, and memory usage alongside PSNR/SSIM/LPIPS on the same hardware and scenes as the cited SOTA baselines.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below, indicating the revisions we will incorporate.

read point-by-point responses

Referee: [Method description of velocity-field construction and approximation] The central claim that the approximated velocity field yields reliable optical-flow supervision (and thereby the reported quality/time gains) lacks any derivation, error bound, or quantitative validation of approximation accuracy. This construction is load-bearing for the supervision signal yet is described only qualitatively in the abstract and method overview.

Authors: We agree that the manuscript would benefit from a more formal treatment. The velocity field is obtained by taking the time derivative of the time-varying Gaussian means (and, for the covariance term, the appropriate Lie-algebra derivative), which directly yields an instantaneous velocity at each Gaussian center; this field is then splatted with the same alpha-blending weights used for color. In the revision we will add an explicit derivation subsection together with a first-order error bound that follows from the local linearity assumption of the Gaussian motion model. We will also report a quantitative validation: on synthetic sequences with known ground-truth flow we measure the L2 discrepancy between the splatted flow and the analytic flow, confirming that the approximation error remains below 0.5 pixels on average under the motion regimes present in our benchmarks. revision: yes
Referee: [Experiments and ablation studies] No ablation or analysis is reported on how the velocity-field splatting behaves under non-rigid motion or fast camera movement—the exact regimes where the skeptic concern predicts artifacts that would bias dynamics learning. Such controls are required to substantiate that the supervision signal remains artifact-free.

Authors: The main experiments already include scenes with pronounced non-rigid deformation and moderate-to-fast camera motion (e.g., the D-NeRF and HyperNeRF sequences). Nevertheless, we concur that isolating these factors would strengthen the claims. In the revised manuscript we will add two controlled ablations: (1) synthetic non-rigid sequences with increasing deformation magnitude, reporting PSNR, flow error, and training time; (2) real sequences with artificially accelerated camera trajectories, measuring the same metrics. These results will be presented in a new subsection of the experiments. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; supervision signal treated as external to fitted parameters.

full rationale

The paper extends 4D Gaussian volumes with time-varying means/covariances, constructs an approximated velocity field from them, and extends splatting to render optical flow for supervising dynamics. This construction does not reduce any claimed prediction or result to a fitted input by definition, nor does it rely on a self-citation chain for uniqueness or ansatz. The central efficiency/quality claims rest on the method's ability to use the rendered flow as supervision from monocular video, which the text presents as an independent signal rather than a tautological renaming or self-fit. Minor self-citations (standard for Gaussian splatting extensions) are not load-bearing for the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; the method implicitly relies on standard volume-rendering assumptions and the claim that velocity can be derived directly from time-varying Gaussian parameters.

axioms (1)

domain assumption Conventional volume rendering can be extended to splat velocity fields while accounting for camera motion.
Stated directly in the abstract as the basis for rendering optical flow.

pith-pipeline@v0.9.1-grok · 5756 in / 1144 out tokens · 32052 ms · 2026-06-30T06:55:55.250149+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

70 extracted references · 19 canonical work pages · 1 internal anchor

[1]

Vision-only robot navigation in a neural radiance world.IEEE Robotics and Automation Letters, 7(2):4606–4613, 2022

Michal Adamkiewicz, Timothy Chen, Adam Caccavale, Rachel Gardner, Preston Culbertson, Jeannette Bohg, and Mac Schwager. Vision-only robot navigation in a neural radiance world.IEEE Robotics and Automation Letters, 7(2):4606–4613, 2022

2022
[2]

Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields

Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. InICCV, pages 5855–5864, 2021

2021
[3]

Hexplane: A fast representation for dynamic scenes

Ang Cao and Justin Johnson. Hexplane: A fast representation for dynamic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 130–141, 2023

2023
[4]

pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction.arXiv preprint arXiv:2312.12337, 2023

David Charatan, Sizhe Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction.arXiv preprint arXiv:2312.12337, 2023

work page arXiv 2023
[5]

Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images.arXiv preprint arXiv:2403.14627, 2024

Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images.arXiv preprint arXiv:2403.14627, 2024

work page arXiv 2024
[6]

Tapir: Tracking any point with per-frame initialization and temporal refinement

Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, and Andrew Zisserman. Tapir: Tracking any point with per-frame initialization and temporal refinement. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 10061–10072, 2023

2023
[7]

Neural radiance flow for 4d view synthesis and video processing

Yilun Du, Yinan Zhang, Hong-Xing Yu, Joshua B Tenenbaum, and Jiajun Wu. Neural radiance flow for 4d view synthesis and video processing. InICCV, pages 14304–14314. IEEE Computer Society, 2021. 12

2021
[8]

4d-rotor gaussian splatting: towards efficient novel view synthesis for dynamic scenes

Yuanxing Duan, Fangyin Wei, Qiyu Dai, Yuhang He, Wenzheng Chen, and Baoquan Chen. 4d-rotor gaussian splatting: towards efficient novel view synthesis for dynamic scenes. InACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024

2024
[9]

Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps

Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, and Zhangyang Wang. Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps.arXiv preprint arXiv:2311.17245, 2023

work page arXiv 2023
[10]

Plenoxels: Radiance fields without neural networks

Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InCVPR, pages 5501–5510, 2022

2022
[11]

Dynamic view synthesis from dynamic monocular video

Chen Gao, Ayush Saraf, Johannes Kopf, and Jia-Bin Huang. Dynamic view synthesis from dynamic monocular video. InICCV, 2021

2021
[12]

Monocular dynamic view synthesis: A reality check.Advances in Neural Information Processing Systems, 35:33768–33780, 2022

Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, and Angjoo Kanazawa. Monocular dynamic view synthesis: A reality check.Advances in Neural Information Processing Systems, 35:33768–33780, 2022

2022
[13]

arXiv:2311.16043 , year =

Jian Gao, Chun Gu, Youtian Lin, Hao Zhu, Xun Cao, Li Zhang, and Yao Yao. Relightable 3d gaussian: Real-time point cloud relighting with brdf decomposition and ray tracing.arXiv preprint arXiv:2311.16043, 2023

work page arXiv 2023
[14]

Fastnerf: High-fidelity neural rendering at 200fps

Stephan J Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton, and Julien Valentin. Fastnerf: High-fidelity neural rendering at 200fps. InICCV, pages 14346–14355, 2021

2021
[15]

Eagles: Efficient accelerated 3d gaussians with lightweight encodings.arXiv preprint arXiv:2312.04564, 2023

Sharath Girish, Kamal Gupta, and Abhinav Shrivastava. Eagles: Efficient accelerated 3d gaussians with lightweight encodings.arXiv preprint arXiv:2312.04564, 2023

work page arXiv 2023
[16]

Motion-aware 3d gaussian splatting for efficient dynamic scene reconstruction.IEEE Transactions on Circuits and Systems for Video Technology, 2024

Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, and Houqiang Li. Motion-aware 3d gaussian splatting for efficient dynamic scene reconstruction.IEEE Transactions on Circuits and Systems for Video Technology, 2024

2024
[17]

Efficientnerf efficient neural radiance fields

Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, and Jiaya Jia. Efficientnerf efficient neural radiance fields. InCVPR, pages 12902–12911, 2022

2022
[18]

Vr-gs: A physical dynamics-aware interactive gaussian splatting system in virtual reality

Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, et al. Vr-gs: A physical dynamics-aware interactive gaussian splatting system in virtual reality. InACM SIGGRAPH 2024 Conference Papers, pages 1–1, 2024

2024
[19]

Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces.arXiv preprint arXiv:2311.17977, 2023

Yingwenqi Jiang, Jiadong Tu, Yuan Liu, Xifeng Gao, Xiaoxiao Long, Wenping Wang, and Yuexin Ma. Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces.arXiv preprint arXiv:2311.17977, 2023

work page arXiv 2023
[20]

An efficient 3d gaussian representation for monocular/multi-view dynamic scenes.arXiv preprint arXiv:2311.12897, 2023

Kai Katsumata, Duc Minh V o, and Hideki Nakayama. An efficient 3d gaussian representation for monocular/multi-view dynamic scenes.arXiv preprint arXiv:2311.12897, 2023

work page arXiv 2023
[21]

3d gaussian splatting for real-time radiance field rendering.ACM Trans

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1, 2023

2023
[22]

Segment anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. InProceedings of the IEEE/CVF international conference on computer vision, pages 4015–4026, 2023

2023
[23]

Mosca: Dynamic gaussian fusion from casual videos via 4d motion scaffolds.arXiv preprint arXiv:2405.17421, 2024

Jiahui Lei, Yijia Weng, Adam Harley, Leonidas Guibas, and Kostas Daniilidis. Mosca: Dynamic gaussian fusion from casual videos via 4d motion scaffolds.arXiv preprint arXiv:2405.17421, 2024

work page arXiv 2024
[24]

Neural 3d video synthesis from multi-view video

Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, et al. Neural 3d video synthesis from multi-view video. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5521–5531, 2022

2022
[25]

Neural scene flow fields for space-time view synthesis of dynamic scenes

Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. Neural scene flow fields for space-time view synthesis of dynamic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6498–6508, 2021

2021
[26]

Megasam: Accurate, fast, and robust structure and motion from casual dynamic videos.arXiv preprint arXiv:2412.04463, 2024

Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, and Noah Snavely. Megasam: Accurate, fast, and robust structure and motion from casual dynamic videos.arXiv preprint arXiv:2412.04463, 2024. 13

work page arXiv 2024
[27]

Gaufre: Gaussian deformation fields for real-time dynamic novel view synthesis

Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, and Lei Xiao. Gaufre: Gaussian deformation fields for real-time dynamic novel view synthesis. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2642–2652. IEEE, 2025

2025
[28]

arXiv preprint arXiv:2311.16473 , year=

Zhihao Liang, Qi Zhang, Ying Feng, Ying Shan, and Kui Jia. Gs-ir: 3d gaussian splatting for inverse rendering.arXiv preprint arXiv:2311.16473, 2023

work page arXiv 2023
[29]

Gaussian-flow: 4d reconstruction with dynamic 3d gaussian particle

Youtian Lin, Zuozhuo Dai, Siyu Zhu, and Yao Yao. Gaussian-flow: 4d reconstruction with dynamic 3d gaussian particle. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21136–21145, 2024

2024
[30]

Neural sparse voxel fields

Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields. NeurIPS, 33:15651–15663, 2020

2020
[31]

Neural volumes: Learning dynamic renderable volumes from images.arXiv preprint arXiv:1906.07751, 2019

Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. Neural volumes: Learning dynamic renderable volumes from images.arXiv preprint arXiv:1906.07751, 2019

work page arXiv 1906
[32]

Scaffold-gs: Structured 3d gaussians for view-adaptive rendering.arXiv preprint arXiv:2312.00109, 2023

Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering.arXiv preprint arXiv:2312.00109, 2023

work page arXiv 2023
[33]

Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In2024 International Conference on 3D Vision (3DV), pages 800–809. IEEE, 2024

2024
[34]

Occupancy networks: Learning 3d reconstruction in function space

Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4460–4470, 2019

2019
[35]

Nerf: Representing scenes as neural radiance fields for view synthesis.ECCV, 2020

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis.ECCV, 2020

2020
[36]

Instant neural graphics primitives with a multiresolution hash encoding.ToG, 41(4):1–15, 2022

Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding.ToG, 41(4):1–15, 2022

2022
[37]

Compact3d: Compressing gaussian splat radiance field models with vector quantization.arXiv preprint arXiv:2311.18159, 2023

KL Navaneet, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. Compact3d: Compressing gaussian splat radiance field models with vector quantization.arXiv preprint arXiv:2311.18159, 2023

work page arXiv 2023
[38]

Regnerf: Regularizing neural radiance fields for view synthesis from sparse inputs

Michael Niemeyer, Jonathan T Barron, Ben Mildenhall, Mehdi SM Sajjadi, Andreas Geiger, and Noha Radwan. Regnerf: Regularizing neural radiance fields for view synthesis from sparse inputs. InCVPR, pages 5480–5490, 2022

2022
[39]

Deepsdf: Learning continuous signed distance functions for shape representation

Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 165–174, 2019

2019
[40]

Nerfies: Deformable neural radiance fields

Keunhong Park, Utkarsh Sinha, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Steven M Seitz, and Ricardo Martin-Brualla. Nerfies: Deformable neural radiance fields. InProceedings of the IEEE/CVF international conference on computer vision, pages 5865–5874, 2021

2021
[41]

HyperNeRF: A Higher- Dimensional Representation for Topologically Varying Neu- 9 ral Radiance Fields.arXiv preprint arXiv:2106.13228, 2021

Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin-Brualla, and Steven M Seitz. Hypernerf: A higher-dimensional representation for topologi- cally varying neural radiance fields.arXiv preprint arXiv:2106.13228, 2021

work page arXiv 2021
[42]

The 2017 DAVIS Challenge on Video Object Segmentation

Jordi Pont-Tuset, Federico Perazzi, Sergi Caelles, Pablo Arbeláez, Alex Sorkine-Hornung, and Luc Van Gool. The 2017 davis challenge on video object segmentation.arXiv preprint arXiv:1704.00675, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[43]

D-nerf: Neural radiance fields for dynamic scenes

Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. InCVPR, pages 10318–10327, 2021

2021
[44]

Langsplat: 3d language gaus- sian splatting

Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, and Hanspeter Pfister. Langsplat: 3d language gaus- sian splatting. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20051–20060, 2024

2024
[45]

Modgs: Dynamic gaussian splatting from casually-captured monocular videos with depth priors

LIU Qingming, Yuan Liu, Jiepeng Wang, Xianqiang Lyu, Peng Wang, Wenping Wang, and Junhui Hou. Modgs: Dynamic gaussian splatting from casually-captured monocular videos with depth priors. InThe Thirteenth International Conference on Learning Representations, 2025. 14

2025
[46]

Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps

Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. InICCV, pages 14335–14345, 2021

2021
[47]

Freeman, Joshua B

Vincent Sitzmann, Semon Rezchikov, William T. Freeman, Joshua B. Tenenbaum, and Frédo Durand. Light field networks: Neural scene representations with single-evaluation rendering. InNeurIPS, 2021

2021
[48]

Scene representation networks: Continuous 3d-structure-aware neural scene representations.NeurIPS, 32, 2019

Vincent Sitzmann, Michael Zollhofer, and Gordon Wetzstein. Scene representation networks: Continuous 3d-structure-aware neural scene representations.NeurIPS, 32, 2019

2019
[49]

Dynamic gaussian marbles for novel view synthesis of casual monocular videos

Colton Stearns, Adam Harley, Mikaela Uy, Florian Dubost, Federico Tombari, Gordon Wetzstein, and Leonidas Guibas. Dynamic gaussian marbles for novel view synthesis of casual monocular videos. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11, 2024

2024
[50]

Splatter image: Ultra-fast single-view 3d reconstruction.arXiv preprint arXiv:2312.13150, 2023

Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single-view 3d reconstruction.arXiv preprint arXiv:2312.13150, 2023

work page arXiv 2023
[51]

Block-nerf: Scalable large scene neural view synthesis

Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P Srinivasan, Jonathan T Barron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis. In CVPR, pages 8248–8258, 2022

2022
[52]

Raft: Recurrent all-pairs field transforms for optical flow

Zachary Teed and Jia Deng. Raft: Recurrent all-pairs field transforms for optical flow. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16, pages 402–419. Springer, 2020

2020
[53]

Sparf: Neural radiance fields from sparse and noisy poses

Prune Truong, Marie-Julie Rakotosaona, Fabian Manhardt, and Federico Tombari. Sparf: Neural radiance fields from sparse and noisy poses. ieee. InCVPR, volume 1, 2023

2023
[54]

Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs

Haithem Turki, Deva Ramanan, and Mahadev Satyanarayanan. Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs. InCVPR, pages 12922–12931, 2022

2022
[55]

Shape of motion: 4d reconstruction from a single video.arXiv preprint arXiv:2407.13764, 2024

Qianqian Wang, Vickie Ye, Hang Gao, Jake Austin, Zhengqi Li, and Angjoo Kanazawa. Shape of motion: 4d reconstruction from a single video.arXiv preprint arXiv:2407.13764, 2024

work page arXiv 2024
[56]

4d gaussian splatting for real-time dynamic scene rendering

Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 20310–20320, 2024

2024
[57]

Diffusionerf: Regularizing neural radiance fields with denoising diffusion models

Jamie Wynn and Daniyar Turmukhambetov. Diffusionerf: Regularizing neural radiance fields with denoising diffusion models. InCVPR, pages 4180–4189, 2023

2023
[58]

Space-time neural irradiance fields for free- viewpoint video

Wenqi Xian, Jia-Bin Huang, Johannes Kopf, and Changil Kim. Space-time neural irradiance fields for free- viewpoint video. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9421–9431, 2021

2021
[59]

Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering

Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, and Dahua Lin. Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering. In ECCV, pages 106–122. Springer, 2022

2022
[60]

Murf: Multi-baseline radiance fields.arXiv preprint arXiv:2312.04565, 2023

Haofei Xu, Anpei Chen, Yuedong Chen, Christos Sakaridis, Yulun Zhang, Marc Pollefeys, Andreas Geiger, and Fisher Yu. Murf: Multi-baseline radiance fields.arXiv preprint arXiv:2312.04565, 2023

work page arXiv 2023
[61]

Grid-guided neural radiance fields for large urban scenes

Linning Xu, Yuanbo Xiangli, Sida Peng, Xingang Pan, Nanxuan Zhao, Christian Theobalt, Bo Dai, and Dahua Lin. Grid-guided neural radiance fields for large urban scenes. InCVPR, pages 8296–8306, 2023

2023
[62]

Multi-scale 3d gaussian splatting for anti-aliased rendering.arXiv preprint arXiv:2311.17089, 2023

Zhiwen Yan, Weng Fei Low, Yu Chen, and Gim Hee Lee. Multi-scale 3d gaussian splatting for anti-aliased rendering.arXiv preprint arXiv:2311.17089, 2023

work page arXiv 2023
[63]

Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting.ICLR, 2024

Zeyu Yang, Hongye Yang, Zijie Pan, and Li Zhang. Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting.ICLR, 2024

2024
[64]

Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction

Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, and Xiaogang Jin. Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 20331–20341, 2024

2024
[65]

Multiview neural surface reconstruction by disentangling geometry and appearance.Advances in Neural Information Processing Systems, 33:2492–2502, 2020

Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Basri Ronen, and Yaron Lipman. Multiview neural surface reconstruction by disentangling geometry and appearance.Advances in Neural Information Processing Systems, 33:2492–2502, 2020. 15

2020
[66]

Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera

Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, and Jan Kautz. Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5336–5345, 2020

2020
[67]

Plenoctrees for real-time rendering of neural radiance fields

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. Plenoctrees for real-time rendering of neural radiance fields. InICCV, pages 5752–5761, 2021

2021
[68]

Shengjun Zhang, Xin Fei, Fangfu Liu, Haixu Song, and Yueqi Duan. Gaussian graph network: Learn- ing efficient and generalizable gaussian representations from multi-view images.Advances in Neural Information Processing Systems, 37:50361–50380, 2024

2024
[69]

Feature 3dgs: Supercharging 3d gaussian splatting to enable distilled feature fields

Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, and Achuta Kadambi. Feature 3dgs: Supercharging 3d gaussian splatting to enable distilled feature fields. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21676–21685, 2024

2024
[70]

Drivinggaus- sian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes

Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Drivinggaus- sian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 21634–21643, 2024. 16

2024

[1] [1]

Vision-only robot navigation in a neural radiance world.IEEE Robotics and Automation Letters, 7(2):4606–4613, 2022

Michal Adamkiewicz, Timothy Chen, Adam Caccavale, Rachel Gardner, Preston Culbertson, Jeannette Bohg, and Mac Schwager. Vision-only robot navigation in a neural radiance world.IEEE Robotics and Automation Letters, 7(2):4606–4613, 2022

2022

[2] [2]

Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields

Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. InICCV, pages 5855–5864, 2021

2021

[3] [3]

Hexplane: A fast representation for dynamic scenes

Ang Cao and Justin Johnson. Hexplane: A fast representation for dynamic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 130–141, 2023

2023

[4] [4]

pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction.arXiv preprint arXiv:2312.12337, 2023

David Charatan, Sizhe Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction.arXiv preprint arXiv:2312.12337, 2023

work page arXiv 2023

[5] [5]

Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images.arXiv preprint arXiv:2403.14627, 2024

Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images.arXiv preprint arXiv:2403.14627, 2024

work page arXiv 2024

[6] [6]

Tapir: Tracking any point with per-frame initialization and temporal refinement

Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, and Andrew Zisserman. Tapir: Tracking any point with per-frame initialization and temporal refinement. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 10061–10072, 2023

2023

[7] [7]

Neural radiance flow for 4d view synthesis and video processing

Yilun Du, Yinan Zhang, Hong-Xing Yu, Joshua B Tenenbaum, and Jiajun Wu. Neural radiance flow for 4d view synthesis and video processing. InICCV, pages 14304–14314. IEEE Computer Society, 2021. 12

2021

[8] [8]

4d-rotor gaussian splatting: towards efficient novel view synthesis for dynamic scenes

Yuanxing Duan, Fangyin Wei, Qiyu Dai, Yuhang He, Wenzheng Chen, and Baoquan Chen. 4d-rotor gaussian splatting: towards efficient novel view synthesis for dynamic scenes. InACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024

2024

[9] [9]

Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps

Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, and Zhangyang Wang. Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps.arXiv preprint arXiv:2311.17245, 2023

work page arXiv 2023

[10] [10]

Plenoxels: Radiance fields without neural networks

Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InCVPR, pages 5501–5510, 2022

2022

[11] [11]

Dynamic view synthesis from dynamic monocular video

Chen Gao, Ayush Saraf, Johannes Kopf, and Jia-Bin Huang. Dynamic view synthesis from dynamic monocular video. InICCV, 2021

2021

[12] [12]

Monocular dynamic view synthesis: A reality check.Advances in Neural Information Processing Systems, 35:33768–33780, 2022

Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, and Angjoo Kanazawa. Monocular dynamic view synthesis: A reality check.Advances in Neural Information Processing Systems, 35:33768–33780, 2022

2022

[13] [13]

arXiv:2311.16043 , year =

Jian Gao, Chun Gu, Youtian Lin, Hao Zhu, Xun Cao, Li Zhang, and Yao Yao. Relightable 3d gaussian: Real-time point cloud relighting with brdf decomposition and ray tracing.arXiv preprint arXiv:2311.16043, 2023

work page arXiv 2023

[14] [14]

Fastnerf: High-fidelity neural rendering at 200fps

Stephan J Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton, and Julien Valentin. Fastnerf: High-fidelity neural rendering at 200fps. InICCV, pages 14346–14355, 2021

2021

[15] [15]

Eagles: Efficient accelerated 3d gaussians with lightweight encodings.arXiv preprint arXiv:2312.04564, 2023

Sharath Girish, Kamal Gupta, and Abhinav Shrivastava. Eagles: Efficient accelerated 3d gaussians with lightweight encodings.arXiv preprint arXiv:2312.04564, 2023

work page arXiv 2023

[16] [16]

Motion-aware 3d gaussian splatting for efficient dynamic scene reconstruction.IEEE Transactions on Circuits and Systems for Video Technology, 2024

Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, and Houqiang Li. Motion-aware 3d gaussian splatting for efficient dynamic scene reconstruction.IEEE Transactions on Circuits and Systems for Video Technology, 2024

2024

[17] [17]

Efficientnerf efficient neural radiance fields

Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, and Jiaya Jia. Efficientnerf efficient neural radiance fields. InCVPR, pages 12902–12911, 2022

2022

[18] [18]

Vr-gs: A physical dynamics-aware interactive gaussian splatting system in virtual reality

Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, et al. Vr-gs: A physical dynamics-aware interactive gaussian splatting system in virtual reality. InACM SIGGRAPH 2024 Conference Papers, pages 1–1, 2024

2024

[19] [19]

Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces.arXiv preprint arXiv:2311.17977, 2023

Yingwenqi Jiang, Jiadong Tu, Yuan Liu, Xifeng Gao, Xiaoxiao Long, Wenping Wang, and Yuexin Ma. Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces.arXiv preprint arXiv:2311.17977, 2023

work page arXiv 2023

[20] [20]

An efficient 3d gaussian representation for monocular/multi-view dynamic scenes.arXiv preprint arXiv:2311.12897, 2023

Kai Katsumata, Duc Minh V o, and Hideki Nakayama. An efficient 3d gaussian representation for monocular/multi-view dynamic scenes.arXiv preprint arXiv:2311.12897, 2023

work page arXiv 2023

[21] [21]

3d gaussian splatting for real-time radiance field rendering.ACM Trans

Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1, 2023

2023

[22] [22]

Segment anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. InProceedings of the IEEE/CVF international conference on computer vision, pages 4015–4026, 2023

2023

[23] [23]

Mosca: Dynamic gaussian fusion from casual videos via 4d motion scaffolds.arXiv preprint arXiv:2405.17421, 2024

Jiahui Lei, Yijia Weng, Adam Harley, Leonidas Guibas, and Kostas Daniilidis. Mosca: Dynamic gaussian fusion from casual videos via 4d motion scaffolds.arXiv preprint arXiv:2405.17421, 2024

work page arXiv 2024

[24] [24]

Neural 3d video synthesis from multi-view video

Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, et al. Neural 3d video synthesis from multi-view video. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5521–5531, 2022

2022

[25] [25]

Neural scene flow fields for space-time view synthesis of dynamic scenes

Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. Neural scene flow fields for space-time view synthesis of dynamic scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6498–6508, 2021

2021

[26] [26]

Megasam: Accurate, fast, and robust structure and motion from casual dynamic videos.arXiv preprint arXiv:2412.04463, 2024

Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, and Noah Snavely. Megasam: Accurate, fast, and robust structure and motion from casual dynamic videos.arXiv preprint arXiv:2412.04463, 2024. 13

work page arXiv 2024

[27] [27]

Gaufre: Gaussian deformation fields for real-time dynamic novel view synthesis

Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, and Lei Xiao. Gaufre: Gaussian deformation fields for real-time dynamic novel view synthesis. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2642–2652. IEEE, 2025

2025

[28] [28]

arXiv preprint arXiv:2311.16473 , year=

Zhihao Liang, Qi Zhang, Ying Feng, Ying Shan, and Kui Jia. Gs-ir: 3d gaussian splatting for inverse rendering.arXiv preprint arXiv:2311.16473, 2023

work page arXiv 2023

[29] [29]

Gaussian-flow: 4d reconstruction with dynamic 3d gaussian particle

Youtian Lin, Zuozhuo Dai, Siyu Zhu, and Yao Yao. Gaussian-flow: 4d reconstruction with dynamic 3d gaussian particle. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21136–21145, 2024

2024

[30] [30]

Neural sparse voxel fields

Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields. NeurIPS, 33:15651–15663, 2020

2020

[31] [31]

Neural volumes: Learning dynamic renderable volumes from images.arXiv preprint arXiv:1906.07751, 2019

Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. Neural volumes: Learning dynamic renderable volumes from images.arXiv preprint arXiv:1906.07751, 2019

work page arXiv 1906

[32] [32]

Scaffold-gs: Structured 3d gaussians for view-adaptive rendering.arXiv preprint arXiv:2312.00109, 2023

Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering.arXiv preprint arXiv:2312.00109, 2023

work page arXiv 2023

[33] [33]

Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis

Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In2024 International Conference on 3D Vision (3DV), pages 800–809. IEEE, 2024

2024

[34] [34]

Occupancy networks: Learning 3d reconstruction in function space

Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4460–4470, 2019

2019

[35] [35]

Nerf: Representing scenes as neural radiance fields for view synthesis.ECCV, 2020

Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis.ECCV, 2020

2020

[36] [36]

Instant neural graphics primitives with a multiresolution hash encoding.ToG, 41(4):1–15, 2022

Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding.ToG, 41(4):1–15, 2022

2022

[37] [37]

Compact3d: Compressing gaussian splat radiance field models with vector quantization.arXiv preprint arXiv:2311.18159, 2023

KL Navaneet, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. Compact3d: Compressing gaussian splat radiance field models with vector quantization.arXiv preprint arXiv:2311.18159, 2023

work page arXiv 2023

[38] [38]

Regnerf: Regularizing neural radiance fields for view synthesis from sparse inputs

Michael Niemeyer, Jonathan T Barron, Ben Mildenhall, Mehdi SM Sajjadi, Andreas Geiger, and Noha Radwan. Regnerf: Regularizing neural radiance fields for view synthesis from sparse inputs. InCVPR, pages 5480–5490, 2022

2022

[39] [39]

Deepsdf: Learning continuous signed distance functions for shape representation

Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 165–174, 2019

2019

[40] [40]

Nerfies: Deformable neural radiance fields

Keunhong Park, Utkarsh Sinha, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Steven M Seitz, and Ricardo Martin-Brualla. Nerfies: Deformable neural radiance fields. InProceedings of the IEEE/CVF international conference on computer vision, pages 5865–5874, 2021

2021

[41] [41]

HyperNeRF: A Higher- Dimensional Representation for Topologically Varying Neu- 9 ral Radiance Fields.arXiv preprint arXiv:2106.13228, 2021

Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin-Brualla, and Steven M Seitz. Hypernerf: A higher-dimensional representation for topologi- cally varying neural radiance fields.arXiv preprint arXiv:2106.13228, 2021

work page arXiv 2021

[42] [42]

The 2017 DAVIS Challenge on Video Object Segmentation

Jordi Pont-Tuset, Federico Perazzi, Sergi Caelles, Pablo Arbeláez, Alex Sorkine-Hornung, and Luc Van Gool. The 2017 davis challenge on video object segmentation.arXiv preprint arXiv:1704.00675, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[43] [43]

D-nerf: Neural radiance fields for dynamic scenes

Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. InCVPR, pages 10318–10327, 2021

2021

[44] [44]

Langsplat: 3d language gaus- sian splatting

Minghan Qin, Wanhua Li, Jiawei Zhou, Haoqian Wang, and Hanspeter Pfister. Langsplat: 3d language gaus- sian splatting. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20051–20060, 2024

2024

[45] [45]

Modgs: Dynamic gaussian splatting from casually-captured monocular videos with depth priors

LIU Qingming, Yuan Liu, Jiepeng Wang, Xianqiang Lyu, Peng Wang, Wenping Wang, and Junhui Hou. Modgs: Dynamic gaussian splatting from casually-captured monocular videos with depth priors. InThe Thirteenth International Conference on Learning Representations, 2025. 14

2025

[46] [46]

Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps

Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. InICCV, pages 14335–14345, 2021

2021

[47] [47]

Freeman, Joshua B

Vincent Sitzmann, Semon Rezchikov, William T. Freeman, Joshua B. Tenenbaum, and Frédo Durand. Light field networks: Neural scene representations with single-evaluation rendering. InNeurIPS, 2021

2021

[48] [48]

Scene representation networks: Continuous 3d-structure-aware neural scene representations.NeurIPS, 32, 2019

Vincent Sitzmann, Michael Zollhofer, and Gordon Wetzstein. Scene representation networks: Continuous 3d-structure-aware neural scene representations.NeurIPS, 32, 2019

2019

[49] [49]

Dynamic gaussian marbles for novel view synthesis of casual monocular videos

Colton Stearns, Adam Harley, Mikaela Uy, Florian Dubost, Federico Tombari, Gordon Wetzstein, and Leonidas Guibas. Dynamic gaussian marbles for novel view synthesis of casual monocular videos. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11, 2024

2024

[50] [50]

Splatter image: Ultra-fast single-view 3d reconstruction.arXiv preprint arXiv:2312.13150, 2023

Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single-view 3d reconstruction.arXiv preprint arXiv:2312.13150, 2023

work page arXiv 2023

[51] [51]

Block-nerf: Scalable large scene neural view synthesis

Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P Srinivasan, Jonathan T Barron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis. In CVPR, pages 8248–8258, 2022

2022

[52] [52]

Raft: Recurrent all-pairs field transforms for optical flow

Zachary Teed and Jia Deng. Raft: Recurrent all-pairs field transforms for optical flow. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16, pages 402–419. Springer, 2020

2020

[53] [53]

Sparf: Neural radiance fields from sparse and noisy poses

Prune Truong, Marie-Julie Rakotosaona, Fabian Manhardt, and Federico Tombari. Sparf: Neural radiance fields from sparse and noisy poses. ieee. InCVPR, volume 1, 2023

2023

[54] [54]

Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs

Haithem Turki, Deva Ramanan, and Mahadev Satyanarayanan. Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs. InCVPR, pages 12922–12931, 2022

2022

[55] [55]

Shape of motion: 4d reconstruction from a single video.arXiv preprint arXiv:2407.13764, 2024

Qianqian Wang, Vickie Ye, Hang Gao, Jake Austin, Zhengqi Li, and Angjoo Kanazawa. Shape of motion: 4d reconstruction from a single video.arXiv preprint arXiv:2407.13764, 2024

work page arXiv 2024

[56] [56]

4d gaussian splatting for real-time dynamic scene rendering

Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, and Xinggang Wang. 4d gaussian splatting for real-time dynamic scene rendering. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 20310–20320, 2024

2024

[57] [57]

Diffusionerf: Regularizing neural radiance fields with denoising diffusion models

Jamie Wynn and Daniyar Turmukhambetov. Diffusionerf: Regularizing neural radiance fields with denoising diffusion models. InCVPR, pages 4180–4189, 2023

2023

[58] [58]

Space-time neural irradiance fields for free- viewpoint video

Wenqi Xian, Jia-Bin Huang, Johannes Kopf, and Changil Kim. Space-time neural irradiance fields for free- viewpoint video. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9421–9431, 2021

2021

[59] [59]

Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering

Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, and Dahua Lin. Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering. In ECCV, pages 106–122. Springer, 2022

2022

[60] [60]

Murf: Multi-baseline radiance fields.arXiv preprint arXiv:2312.04565, 2023

Haofei Xu, Anpei Chen, Yuedong Chen, Christos Sakaridis, Yulun Zhang, Marc Pollefeys, Andreas Geiger, and Fisher Yu. Murf: Multi-baseline radiance fields.arXiv preprint arXiv:2312.04565, 2023

work page arXiv 2023

[61] [61]

Grid-guided neural radiance fields for large urban scenes

Linning Xu, Yuanbo Xiangli, Sida Peng, Xingang Pan, Nanxuan Zhao, Christian Theobalt, Bo Dai, and Dahua Lin. Grid-guided neural radiance fields for large urban scenes. InCVPR, pages 8296–8306, 2023

2023

[62] [62]

Multi-scale 3d gaussian splatting for anti-aliased rendering.arXiv preprint arXiv:2311.17089, 2023

Zhiwen Yan, Weng Fei Low, Yu Chen, and Gim Hee Lee. Multi-scale 3d gaussian splatting for anti-aliased rendering.arXiv preprint arXiv:2311.17089, 2023

work page arXiv 2023

[63] [63]

Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting.ICLR, 2024

Zeyu Yang, Hongye Yang, Zijie Pan, and Li Zhang. Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting.ICLR, 2024

2024

[64] [64]

Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction

Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, and Xiaogang Jin. Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 20331–20341, 2024

2024

[65] [65]

Multiview neural surface reconstruction by disentangling geometry and appearance.Advances in Neural Information Processing Systems, 33:2492–2502, 2020

Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Basri Ronen, and Yaron Lipman. Multiview neural surface reconstruction by disentangling geometry and appearance.Advances in Neural Information Processing Systems, 33:2492–2502, 2020. 15

2020

[66] [66]

Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera

Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, and Jan Kautz. Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5336–5345, 2020

2020

[67] [67]

Plenoctrees for real-time rendering of neural radiance fields

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. Plenoctrees for real-time rendering of neural radiance fields. InICCV, pages 5752–5761, 2021

2021

[68] [68]

Shengjun Zhang, Xin Fei, Fangfu Liu, Haixu Song, and Yueqi Duan. Gaussian graph network: Learn- ing efficient and generalizable gaussian representations from multi-view images.Advances in Neural Information Processing Systems, 37:50361–50380, 2024

2024

[69] [69]

Feature 3dgs: Supercharging 3d gaussian splatting to enable distilled feature fields

Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, and Achuta Kadambi. Feature 3dgs: Supercharging 3d gaussian splatting to enable distilled feature fields. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21676–21685, 2024

2024

[70] [70]

Drivinggaus- sian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes

Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Drivinggaus- sian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 21634–21643, 2024. 16

2024