Spatiotemporal Degradation-Aware 3D Gaussian Splatting for Realistic Underwater Scene Reconstruction
Pith reviewed 2026-05-08 06:37 UTC · model grok-4.3
The pith
Paired intrinsic and degraded Gaussians enable self-supervised reconstruction of clean underwater scenes from degraded videos.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework pairs Intrinsic Gaussians, which represent the true scene, with Degraded Gaussians, which render the observations. A Spatiotemporal Degradation Modeling (SDM) module physically derives each Degraded Gaussian's color from its paired Intrinsic Gaussian, so realistic appearance can be disentangled from degraded underwater images in a self-supervised way. A depth-guided geometry loss and multi-stage optimization support accurate reconstruction.
What carries the argument
Paired Intrinsic Gaussians and Degraded Gaussians linked by the Spatiotemporal Degradation Modeling (SDM) module that physically derives degraded colors from intrinsic ones.
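One way to picture the pairing is as a data structure in which the degraded primitive holds only its own color and borrows geometry from its intrinsic partner. The sketch below is illustrative only: the class names, the six-value covariance layout, and the `toy_degrade` function are assumptions made for this example, not the paper's implementation. It relies on the authors' statement in the rebuttal on this page that paired Gaussians share the same 3D position and covariance.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class IntrinsicGaussian:
    """Water-free primitive (hypothetical layout)."""
    position: Vec3
    # Six unique entries of a symmetric 3x3 covariance (assumed layout).
    covariance: Tuple[float, float, float, float, float, float]
    color: Vec3


@dataclass
class DegradedGaussian:
    """Primitive that renders the observed frame; geometry comes from its pair."""
    intrinsic: IntrinsicGaussian
    color: Vec3

    @property
    def position(self) -> Vec3:
        return self.intrinsic.position

    @property
    def covariance(self) -> Tuple[float, float, float, float, float, float]:
        return self.intrinsic.covariance


def make_pair(intrinsic: IntrinsicGaussian,
              degrade: Callable[[Vec3], Vec3]) -> DegradedGaussian:
    """Derive the degraded color from the intrinsic color via `degrade`."""
    return DegradedGaussian(intrinsic=intrinsic, color=degrade(intrinsic.color))


def toy_degrade(color: Vec3) -> Vec3:
    # Placeholder degradation: red is attenuated most (illustrative numbers).
    r, g, b = color
    return (0.5 * r, 0.8 * g, 0.9 * b)


clean = IntrinsicGaussian((0.0, 0.0, 2.0),
                          (1.0, 0.0, 0.0, 1.0, 0.0, 1.0),
                          (0.8, 0.5, 0.5))
observed = make_pair(clean, toy_degrade)
```

Because `observed` stores no geometry of its own, any change to the intrinsic position or covariance is automatically reflected in the degraded primitive, which is the property the pairing depends on.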
If this is right
- Novel view synthesis produces realistic, water-free scene appearances from underwater videos.
- Both spatial and temporal degradations are modeled explicitly in a unified framework.
- Training remains self-supervised without requiring ground-truth clean data.
- Geometry accuracy improves through the proposed Depth-Guided Geometry Loss and Multi-Stage Optimization strategy.
- The method outperforms prior methods on simulated benchmarks and real-world underwater datasets.
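This summary does not give the form of the depth-guided loss. The authors' rebuttal on this page describes it only as supervising rendered depths against monocular depth estimates. Since monocular depth is typically known only up to an unknown scale and shift, one common pattern is to align the estimate to the rendered depth by least squares before penalizing the residual. The sketch below shows that pattern as an assumption, not as the paper's loss; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def align_scale_shift(mono: Sequence[float],
                      rendered: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (s, b) minimizing sum((s * mono + b - rendered) ** 2)."""
    n = len(mono)
    mean_m = sum(mono) / n
    mean_r = sum(rendered) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0.0:
        # Constant estimate: only a shift can be fitted.
        return 0.0, mean_r
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, rendered))
    scale = cov / var_m
    shift = mean_r - scale * mean_m
    return scale, shift


def depth_alignment_loss(mono: Sequence[float],
                         rendered: Sequence[float]) -> float:
    """Mean absolute residual after aligning the monocular estimate."""
    scale, shift = align_scale_shift(mono, rendered)
    residuals = [abs(scale * m + shift - r) for m, r in zip(mono, rendered)]
    return sum(residuals) / len(residuals)


# Rendered depth that is an exact affine function of the estimate: zero loss.
mono_depth = [0.5, 1.5, 2.5, 3.5]
rendered_depth = [2.0, 4.0, 6.0, 8.0]
consistent_loss = depth_alignment_loss(mono_depth, rendered_depth)

# Pushing one rendered depth out of line produces a positive penalty.
inconsistent_loss = depth_alignment_loss(mono_depth, [2.0, 4.0, 6.0, 12.0])
```

In a real training loop the same computation would be written with a differentiable tensor library so that gradients reach the Gaussian parameters; plain Python is used here only to keep the example self-contained.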
Where Pith is reading between the lines
- This paired Gaussian approach could be adapted to other imaging domains with known physical degradation processes, such as haze or fog.
- The disentangled intrinsic representations may improve performance in underwater computer vision tasks like object tracking.
- Further development could incorporate dynamic elements for reconstructing moving objects in underwater scenes.
- Validation on additional real-world datasets with varying water conditions would test the generality of the degradation model.
Load-bearing premise
The physical derivation of degraded colors from intrinsic colors through the SDM module accurately captures the observed degradations without needing supervised clean data.
What would settle it
Training the model on the degraded frames of the simulated benchmark, then checking whether its novel-view outputs match the benchmark's ground-truth clean appearances, would settle whether the disentanglement succeeds.
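A check of this kind needs a similarity score between a rendered novel view and its clean reference. The page does not name the metric used, so the sketch below assumes peak signal-to-noise ratio (PSNR), a standard choice for novel view synthesis, computed over flattened pixel values in [0, 1]. The function name and the flat-list image format are assumptions for illustration.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float],
         reference: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images."""
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives a mean squared error of 0.01.
score = psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1])
```

Higher values mean the water-free rendering is closer to the clean reference. Comparing this score against the score of the degraded input frame would indicate how much degradation was actually removed.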
Original abstract
Reconstructing realistic underwater scenes from underwater video remains a meaningful yet challenging task in the multimedia domain. The inherent spatiotemporal degradations in underwater imaging, including caustics, flickering, attenuation, and backscattering, frequently result in inaccurate geometry and appearance in existing 3D reconstruction methods. While a few recent works have explored underwater degradation-aware reconstruction, they often address either spatial or temporal degradation alone, falling short in more real-world underwater scenarios where both types of degradation occur. We propose MarineSTD-GS, a novel 3D Gaussian Splatting-based framework that explicitly models both temporal and spatial degradations for realistic underwater scene reconstruction. Specifically, we introduce two paired Gaussian primitives: Intrinsic Gaussians represent the true scene, while Degraded Gaussians render the degraded observations. The color of each Degraded Gaussian is physically derived from its paired Intrinsic Gaussian via a Spatiotemporal Degradation Modeling (SDM) module, enabling self-supervised disentanglement of realistic appearance from degraded images. To ensure stable training and accurate geometry, we further propose a Depth-Guided Geometry Loss and a Multi-Stage Optimization strategy. We also construct a simulated benchmark with diverse spatial and temporal degradations and ground-truth appearances for comprehensive evaluation. Experiments on both simulated and real-world datasets show that MarineSTD-GS robustly handles spatiotemporal degradations and outperforms existing methods in novel view synthesis with realistic, water-free scene appearances.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes MarineSTD-GS, a 3D Gaussian Splatting framework for underwater scene reconstruction from video. It introduces paired Intrinsic Gaussians (representing the true scene) and Degraded Gaussians (rendering observations), with the color of each Degraded Gaussian physically derived from its paired Intrinsic Gaussian via a Spatiotemporal Degradation Modeling (SDM) module. This enables self-supervised disentanglement of realistic appearance from spatiotemporal degradations (caustics, flickering, attenuation, backscattering). The method adds a Depth-Guided Geometry Loss and Multi-Stage Optimization for stable training and accurate geometry, constructs a simulated benchmark with ground-truth appearances, and reports improved novel view synthesis on both simulated and real-world datasets compared to existing methods.
Significance. If the self-supervised disentanglement via the SDM module is shown to produce unique and stable separation without ground-truth clean data, the work would meaningfully advance underwater 3D reconstruction by jointly handling spatial and temporal degradations in a unified 3DGS pipeline. The simulated benchmark with diverse degradations and ground-truth appearances is a concrete positive contribution that could support reproducible evaluation in the field.
major comments (1)
- [Abstract] Abstract (central claim on SDM): The assertion that the SDM module 'physically derives' Degraded Gaussian colors from Intrinsic Gaussians to enable self-supervised disentanglement is load-bearing for the entire contribution, yet the abstract provides no equations, constraints, or analysis demonstrating that the joint optimization of paired Gaussians plus SDM parameters yields a unique solution. Without such grounding, the Depth-Guided Geometry Loss may not prevent compensatory solutions in which degradation parameters absorb scene structure while Intrinsic Gaussians remain under-constrained.
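The uniqueness concern can be made concrete with a toy single-channel model. The sketch below assumes an observation of the form exp(-beta * depth) * intrinsic + backscatter with a constant backscatter term, which is a deliberate simplification chosen for illustration and not the paper's model. All names and numbers are hypothetical.

```python
import math


def observe(intrinsic: float, beta: float, depth: float,
            backscatter: float) -> float:
    """Toy single-channel observation: attenuated color plus constant backscatter."""
    return math.exp(-beta * depth) * intrinsic + backscatter


BACKSCATTER = 0.1

# Two different explanations of the same pixel.
explanation_a = {"intrinsic": 0.8, "beta": math.log(2.0)}  # bright scene, murky water
explanation_b = {"intrinsic": 0.4, "beta": 0.0}            # dark scene, clear water


def predicted(explanation: dict, depth: float) -> float:
    return observe(explanation["intrinsic"], explanation["beta"],
                   depth, BACKSCATTER)


# At a single depth both explanations reproduce the observation (about 0.5).
same_at_one_depth = math.isclose(predicted(explanation_a, 1.0),
                                 predicted(explanation_b, 1.0))

# Seen again from twice the distance, they predict different colors.
differ_at_second_depth = not math.isclose(predicted(explanation_a, 2.0),
                                          predicted(explanation_b, 2.0))
```

A single observation cannot separate scene color from water attenuation, which is the compensatory solution the comment warns about. Observations of the same point at different depths can separate them in this toy setting, but whether that carries over to the full model with time-varying terms is the open question.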
minor comments (1)
- The abstract refers to 'physically derived' colors but does not specify the exact degradation model components (e.g., attenuation, backscattering terms) or how they are parameterized, which reduces clarity for readers.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The primary concern raised centers on the need for stronger grounding of the SDM module's physical derivation claim and its ability to ensure unique disentanglement. We address this below and have revised the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract] Abstract (central claim on SDM): The assertion that the SDM module 'physically derives' Degraded Gaussian colors from Intrinsic Gaussians to enable self-supervised disentanglement is load-bearing for the entire contribution, yet the abstract provides no equations, constraints, or analysis demonstrating that the joint optimization of paired Gaussians plus SDM parameters yields a unique solution. Without such grounding, the Depth-Guided Geometry Loss may not prevent compensatory solutions in which degradation parameters absorb scene structure while Intrinsic Gaussians remain under-constrained.
Authors: We agree that the abstract, due to length constraints, does not include equations. The full manuscript (Section 3.2) specifies the SDM equations: degraded color is computed as C_d = T(d,t) * C_i + B(d,t) + C(t), where T is the transmission (exponential attenuation with depth d and time t), B is backscattering, and C captures caustics/flickering as additive temporal terms derived from physical underwater imaging models. These are not free parameters but are constrained by known physical priors (e.g., Beer-Lambert law for attenuation, wavelength-dependent scattering). The paired Intrinsic/Degraded Gaussians share the same 3D position and covariance, so any attempt by degradation parameters to absorb scene structure would violate multi-view and multi-frame consistency enforced by the rendering loss. The Depth-Guided Geometry Loss further anchors geometry by supervising rendered depths against monocular depth estimates, preventing the Intrinsic Gaussians from becoming under-constrained. We have expanded the abstract with a one-sentence reference to these physical constraints and added a short uniqueness discussion (with ablation on alternative degradation formulations) to Section 4.3 in the revision.
revision: partial
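The equation quoted in the response, C_d = T(d,t) * C_i + B(d,t) + C(t), can be sketched per color channel as follows. Only the overall form and the exponential attenuation come from the response. The saturating backscatter expression, the omission of time dependence in T and B, the sinusoidal `flicker` term, and every parameter name are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence


def degrade_color(intrinsic: Sequence[float],
                  depth: float,
                  t: float,
                  beta_attn: Sequence[float],
                  beta_back: Sequence[float],
                  veiling: Sequence[float],
                  additive_term: Callable[[float], Sequence[float]]) -> List[float]:
    """Per-channel C_d = T * C_i + B + C(t) under simplifying assumptions."""
    extra = additive_term(t)
    degraded = []
    for c_i, b_a, b_b, b_inf, c_t in zip(intrinsic, beta_attn, beta_back,
                                         veiling, extra):
        # Beer-Lambert style attenuation with depth (time dependence omitted).
        transmission = math.exp(-b_a * depth)
        # Assumed backscatter form: saturates toward the veiling light b_inf.
        backscatter = b_inf * (1.0 - math.exp(-b_b * depth))
        degraded.append(transmission * c_i + backscatter + c_t)
    return degraded


def flicker(t: float) -> Sequence[float]:
    # Hypothetical additive temporal term, identical across channels.
    amplitude = 0.05 * math.sin(2.0 * math.pi * t)
    return (amplitude, amplitude, amplitude)


LN2 = math.log(2.0)
observed = degrade_color(intrinsic=(0.8, 0.8, 0.8), depth=1.0, t=0.25,
                         beta_attn=(LN2, LN2, LN2),
                         beta_back=(LN2, LN2, LN2),
                         veiling=(0.2, 0.2, 0.2),
                         additive_term=flicker)
```

With these numbers the transmission is 0.5, the backscatter is 0.1, and the flicker term is 0.05, so each channel comes out near 0.55. At zero depth with no additive term the function returns the intrinsic color unchanged, which is the behavior a water-free rendering relies on.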
Circularity Check
No significant circularity; derivation relies on independent modeling choices and constraints
Full rationale
The paper's central mechanism introduces paired Intrinsic/Degraded Gaussians with an SDM module that applies a physical degradation model to derive colors, plus explicit Depth-Guided Geometry Loss and Multi-Stage Optimization. These are modeling decisions and optimization constraints, not reductions of the output to the input by definition or self-citation. The self-supervised disentanglement claim is supported by the simulated benchmark containing ground-truth clean appearances, which provides an external check independent of the fitted parameters. No equations or steps in the provided text equate the final reconstruction to a tautological fit or rename a known result; the framework adds new components (paired primitives, SDM, geometry loss) whose validity is tested rather than assumed by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Spatiotemporal degradations (caustics, flickering, attenuation, backscattering) can be physically modeled to derive degraded colors from intrinsic scene colors.
invented entities (2)
- Intrinsic Gaussians: no independent evidence
- Degraded Gaussians: no independent evidence
Each level outputs 2 features, yielding a compact yet expressive position descriptor.
Color Encoder 𝜔(·). The Color Encoder 𝜔(·) is a lightweight two-layer MLP with ReLU activations and a hidden width of 16. The output is a 32-dimensional color embedding, providing a learned feature representation for color-based modulation.
Illumination Perturbation Deco...
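The Color Encoder description above (two layers, ReLU, hidden width 16, 32-dimensional output) can be sketched as a plain forward pass. The excerpt does not state the input size, the weight initialization, or whether an activation follows the output layer. The sketch assumes a 3-channel RGB input, uniform random initialization, and a linear output, and all function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def make_linear(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Randomly initialized dense layer (initialization scheme is an assumption)."""
    limit = (1.0 / n_in) ** 0.5
    weights = [[rng.uniform(-limit, limit) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x: Sequence[float], layer: Layer) -> List[float]:
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def color_encoder(rgb: Sequence[float], params: Tuple[Layer, Layer]) -> List[float]:
    """Two-layer MLP: 3 -> 16 (ReLU) -> 32, with a linear output (assumed)."""
    hidden = relu(linear(rgb, params[0]))
    return linear(hidden, params[1])


rng = random.Random(0)
params = (make_linear(3, 16, rng), make_linear(16, 32, rng))
embedding = color_encoder([0.2, 0.5, 0.7], params)
```

At this size the encoder has 3 * 16 + 16 + 16 * 32 + 32 = 608 parameters, which is consistent with the excerpt's description of it as lightweight.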