GaussianFlow SLAM: Monocular Gaussian Splatting SLAM Guided by GaussianFlow
Pith reviewed 2026-05-10 08:51 UTC · model grok-4.3
The pith
GaussianFlow SLAM aligns projected Gaussian motion with optical flow to regularize monocular 3D Gaussian splatting SLAM, yielding better map quality and pose accuracy than prior methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Experiments conducted on public datasets demonstrate that our method achieves superior rendering quality and tracking accuracy compared with state-of-the-art algorithms.
Load-bearing premise
Aligning the projected motion of Gaussians (GaussianFlow) with optical flow supplies structural cues that are consistent enough to regularize both map reconstruction and pose estimation, and to keep monocular optimization out of local minima.
Original abstract
Gaussian splatting has recently gained traction as a compelling map representation for SLAM systems, enabling dense and photo-realistic scene modeling. However, its application to monocular SLAM remains challenging due to the lack of reliable geometric cues from monocular input. Without geometric supervision, mapping or tracking can fall into local minima, resulting in structural degeneracies and inaccuracies. To address this challenge, we propose GaussianFlow SLAM, a monocular 3DGS-SLAM that leverages optical flow as a geometry-aware cue to guide the optimization of both the scene structure and camera poses. By encouraging the projected motion of Gaussians, termed GaussianFlow, to align with the optical flow, our method introduces consistent structural cues to regularize both map reconstruction and pose estimation. Furthermore, we introduce normalized error-based densification and pruning modules to refine inactive and unstable Gaussians, thereby contributing to improved map quality and pose accuracy. Experiments conducted on public datasets demonstrate that our method achieves superior rendering quality and tracking accuracy compared with state-of-the-art algorithms. The source code is available at: https://github.com/url-kaist/gaussianflow-slam.
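The alignment idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes a pinhole camera, tracks only Gaussian centers (ignoring covariance and alpha blending), and uses a plain L1 penalty. The names `project`, `center_flow`, and `flow_alignment_loss` are hypothetical and chosen for illustration only.

```python
import numpy as np

def project(points_w, R, t, K):
    """Project world points (N, 3) to pixels (N, 2) with a pinhole model.

    R (3x3) and t (3,) map world coordinates into the camera frame.
    """
    p_cam = points_w @ R.T + t
    uv = p_cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def center_flow(means_w, pose_a, pose_b, K):
    """Pixel displacement of each Gaussian center from frame a to frame b.

    Simplifying assumption: only the centers move; covariances are ignored.
    """
    return project(means_w, *pose_b, K) - project(means_w, *pose_a, K)

def flow_alignment_loss(means_w, pose_a, pose_b, K, observed_flow):
    """Mean L1 gap between projected center motion and optical-flow samples."""
    predicted = center_flow(means_w, pose_a, pose_b, K)
    return np.abs(predicted - observed_flow).mean()

# Toy setup: two Gaussian centers at depths 2 m and 4 m.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
means = np.array([[0.0, 0.0, 2.0],
                  [0.5, -0.2, 4.0]])
pose_a = (np.eye(3), np.zeros(3))
pose_b = (np.eye(3), np.array([-0.1, 0.0, 0.0]))  # camera shifted 0.1 m along +x

# Stand-in for an optical-flow estimate sampled at the two centers.
observed = center_flow(means, pose_a, pose_b, K)

# A map with the wrong depth for the first Gaussian disagrees with the flow.
bad_means = means.copy()
bad_means[0, 2] = 3.0

loss_good = flow_alignment_loss(means, pose_a, pose_b, K, observed)
loss_bad = flow_alignment_loss(bad_means, pose_a, pose_b, K, observed)
```

In this toy setup the nearer center moves twice as far in the image as the farther one under the same camera translation, which is why a flow-alignment term carries depth information that a purely photometric loss may lack. A wrong depth produces a nonzero loss, while the correct map produces zero.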
Editorial analysis
A structured set of objections, weighed in public.
Axiom & Free-Parameter Ledger
invented entities (1)
- GaussianFlow: no independent evidence
Reference graph
Works this paper leans on
[1] T. Qin, P. Li, and S. Shen, “VINS-Mono: A robust and versatile monocular visual-inertial state estimator,” IEEE Trans. Robot., vol. 34, no. 4, pp. 1004–1020, 2018.
[2] C. Campos, R. Elvira, J. J. G. Rodríguez, J. M. Montiel, and J. D. Tardós, “ORB-SLAM3: An accurate open-source library for visual, visual–inertial, and multi-map SLAM,” IEEE Trans. Robot., vol. 37, no. 6, pp. 1874–1890, 2021.
[3] H. Lim, J. Jeon, and H. Myung, “UV-SLAM: Unconstrained line-based SLAM using vanishing points for structural mapping,” IEEE Robot. Automat. Lett., vol. 7, no. 2, pp. 1518–1525, 2022.
[4] J. Engel, V. Koltun, and D. Cremers, “Direct sparse odometry,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 3, pp. 611–625, 2017.
[5] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing scenes as neural radiance fields for view synthesis,” Commun. ACM, vol. 65, no. 1, pp. 99–106, 2021.
[6] J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman, “Mip-NeRF 360: Unbounded anti-aliased neural radiance fields,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2022, pp. 5470–5479.
[7] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3D Gaussian splatting for real-time radiance field rendering,” ACM Trans. Graph., vol. 42, no. 4, pp. 1–14, 2023.
[8] N. Keetha, J. Karhade, K. M. Jatavallabhula, G. Yang, S. Scherer, D. Ramanan, and J. Luiten, “SplaTAM: Splat track & map 3D Gaussians for dense RGB-D SLAM,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 21 357–21 366.
[9] C. Yan, D. Qu, D. Xu, B. Zhao, Z. Wang, D. Wang, and X. Li, “GS-SLAM: Dense visual SLAM with 3D Gaussian splatting,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 19 595–19 604.
[10] Z. Peng, T. Shao, Y. Liu, J. Zhou, Y. Yang, J. Wang, and K. Zhou, “RTG-SLAM: Real-time 3D reconstruction at scale using Gaussian splatting,” in Proc. ACM SIGGRAPH, 2024, pp. 1–11.
[11] H. Matsuki, R. Murai, P. H. J. Kelly, and A. J. Davison, “Gaussian splatting SLAM,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 18 039–18 048.
[12] X. Lang, L. Li, H. Zhang, F. Xiong, M. Xu, Y. Liu, X. Zuo, and J. Lv, “Gaussian-LIC: Photo-realistic LiDAR-inertial-camera SLAM with 3D Gaussian splatting,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 8500–8507.
[13] L. C. Sun, N. P. Bhatt, J. C. Liu, Z. Fan, Z. Wang, T. E. Humphreys, and U. Topcu, “MM3DGS-SLAM: Multi-modal 3D Gaussian splatting for SLAM using vision, depth, and inertial measurements,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2024, pp. 10 159–10 166.
[14] S. Yu, C. Cheng, Y. Zhou, X. Yang, and H. Wang, “RGB-only Gaussian splatting SLAM for unbounded outdoor scenes,” in Proc. IEEE Int. Conf. Robot. Automat., 2025.
[15] W. Zhang, Q. Cheng, D. Skuddis, N. Zeller, D. Cremers, and N. Haala, “HI-SLAM2: Geometry-aware Gaussian SLAM for fast monocular scene reconstruction,” IEEE Trans. Robot., vol. 41, pp. 6478–6493, 2025.
[16] J. Zheng, Z. Zhu, V. Bieri, M. Pollefeys, S. Peng, and A. Iro, “WildGS-SLAM: Monocular Gaussian splatting SLAM in dynamic environments,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2025, pp. 11 461–11 471.
[17] C. Schmidt, J. Piekenbrinck, and B. Leibe, “Look Gauss, No Pose: Novel view synthesis using Gaussian splatting without accurate pose initialization,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2024, pp. 8732–8739.
[18] H. Huang, L. Li, C. Hui, and S.-K. Yeung, “Photo-SLAM: Real-time simultaneous localization and photorealistic mapping for monocular, stereo, and RGB-D cameras,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 21 584–21 593.
[19] M. Burri, J. Nikolic, P. Gohl, T. Schneider, J. Rehder, S. Omari, M. W. Achtelik, and R. Siegwart, “The EuRoC micro aerial vehicle datasets,” Int. J. Robot. Res., vol. 35, no. 10, pp. 1157–1163, 2016.
[20] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, “A benchmark for the evaluation of RGB-D SLAM systems,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2012, pp. 573–580.
[21] Z. Zhu, S. Peng, V. Larsson, W. Xu, H. Bao, Z. Cui, M. R. Oswald, and M. Pollefeys, “NICE-SLAM: Neural implicit scalable encoding for SLAM,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2022, pp. 12 786–12 796.
[22] Z. Zhu, S. Peng, V. Larsson, Z. Cui, M. R. Oswald, A. Geiger, and M. Pollefeys, “NICER-SLAM: Neural implicit scene encoding for RGB SLAM,” in Proc. IEEE Int. Conf. 3D Vis., 2024, pp. 42–52.
[23] A. Rosinol, J. J. Leonard, and L. Carlone, “NeRF-SLAM: Real-time dense monocular SLAM with neural radiance fields,” in Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2023, pp. 3437–3444.
[24] P. Geneva, K. Eckenhoff, W. Lee, Y. Yang, and G. Huang, “OpenVINS: A research platform for visual-inertial estimation,” in Proc. IEEE Int. Conf. Robot. Automat., 2020, pp. 4666–4672.
[25] B. D. Lucas and T. Kanade, “An iterative image registration technique with an application to stereo vision,” in Proc. Int. Joint Conf. Artif. Intell., 1981, pp. 674–679.
[26] W. Wang, Y. Hu, and S. Scherer, “TartanVO: A generalizable learning-based VO,” in Proc. Conf. Robot Learn., 2021, pp. 1761–1772.
[27] Z. Min, Y. Yang, and E. Dunn, “VOLDOR: Visual odometry from log-logistic dense optical flow residuals,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 4898–4909.
[28] Z. Teed and J. Deng, “DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras,” Adv. Neural Inf. Process. Syst., vol. 34, pp. 16 558–16 569, 2021.
[29] Z. Teed, L. Lipson, and J. Deng, “Deep patch visual odometry,” Adv. Neural Inf. Process. Syst., vol. 36, pp. 39 033–39 051, 2023.
[30] Z. Teed and J. Deng, “RAFT: Recurrent all-pairs field transforms for optical flow,” in Proc. Eur. Conf. Comput. Vis., 2020, pp. 402–419.
[31] Q. Gao, Q. Xu, Z. Cao, B. Mildenhall, W. Ma, L. Chen, D. Tang, and U. Neumann, “GaussianFlow: Splatting Gaussian dynamics for 4D content creation,” arXiv preprint arXiv:2403.12365, 2024.
[32] R. Zhu, Y. Liang, H. Chang, J. Deng, J. Lu, W. Yang, T. Zhang, and Y. Zhang, “MotionGS: Exploring explicit motion guidance for deformable 3D Gaussian splatting,” Adv. Neural Inf. Process. Syst., vol. 37, pp. 101 790–101 817, 2024.
[33] Y. Lin, Z. Dai, S. Zhu, and Y. Yao, “Gaussian-Flow: 4D reconstruction with dynamic 3D Gaussian particle,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 21 136–21 145.
[34] H. Zhou, J. Shao, L. Xu, D. Bai, W. Qiu, B. Liu, Y. Wang, A. Geiger, and Y. Liao, “HUGS: Holistic urban 3D scene understanding via Gaussian splatting,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2024, pp. 21 336–21 345.
[35] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “PyTorch: An imperative style, high-performance deep learning library,” Adv. Neural Inf. Process. Syst., vol. 32, pp. 8024–8035, 2019.
[36] C.-A. Deledalle, L. Denis, S. Tabti, and F. Tupin, “Closed-form expressions of the eigen decomposition of 2×2 and 3×3 Hermitian matrices,” Ph.D. dissertation, Université de Lyon, 2017.
[37] J. Sola, J. Deray, and D. Atchuthan, “A micro Lie theory for state estimation in robotics,” arXiv preprint arXiv:1812.01537, 2018.
[38] B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon, “Bundle adjustment - a modern synthesis,” in Proc. International workshop on vision algorithms, 1999, pp. 298–372.
[39] Y. Duan, F. Wei, Q. Dai, Y. He, W. Chen, and B. Chen, “4D-rotor Gaussian splatting: Towards efficient novel view synthesis for dynamic scenes,” in Proc. ACM SIGGRAPH, 2024, pp. 1–11.
[40] S. Rota Bulò, L. Porzi, and P. Kontschieder, “Revising densification in Gaussian splatting,” in Proc. Eur. Conf. Comput. Vis. Springer, 2024, pp. 347–362.
[41] M. Kong, J. Lee, S. Lee, and E. Kim, “DGS-SLAM: Gaussian splatting SLAM in dynamic environment,” arXiv preprint arXiv:2411.10722, 2024.