Diving Deeper into Underwater Image Enhancement: A Survey
Pith reviewed 2026-05-24 20:42 UTC · model grok-4.3
The pith
This survey reviews deep learning methods for underwater image enhancement and benchmarks them on diverse datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper aims to establish a reference for the field. It first presents underwater image formation models, then surveys deep enhancement networks with attention to their architecture, parameters, training data, loss functions, and training configurations. It goes on to summarize datasets and metrics, run controlled comparisons of the algorithms, identify shortcomings in existing benchmarks, and outline unsolved open issues together with suggested research directions.
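The formation models themselves are not reproduced on this page. As orientation only, the sketch below uses the simplified scattering model that is widely used in the dehazing and underwater literature, I_c = J_c * t_c + B_c * (1 - t_c) with t_c = exp(-beta_c * d), to show how a clean pixel could be degraded when synthesizing training data. The function name, the attenuation coefficients, and the background light values are illustrative assumptions, not values taken from the survey.

```python
import math

def synthesize_underwater_pixel(clean_rgb, depth_m, attenuation, background_light):
    """Degrade one RGB pixel with I_c = J_c * t_c + B_c * (1 - t_c).

    t_c = exp(-beta_c * depth) is the per-channel transmission.
    All intensities are floats in [0, 1].
    """
    degraded = []
    for j, beta, b in zip(clean_rgb, attenuation, background_light):
        t = math.exp(-beta * depth_m)          # transmission for this channel
        degraded.append(j * t + b * (1.0 - t))  # direct signal + backscatter
    return tuple(degraded)

# Hypothetical coefficients (assumption): red is attenuated fastest in water.
ATTENUATION = (0.60, 0.12, 0.08)       # beta for R, G, B, per metre
BACKGROUND_LIGHT = (0.05, 0.45, 0.55)  # bluish-green veiling light

clean = (0.8, 0.5, 0.3)
near = synthesize_underwater_pixel(clean, 1.0, ATTENUATION, BACKGROUND_LIGHT)
far = synthesize_underwater_pixel(clean, 10.0, ATTENUATION, BACKGROUND_LIGHT)
print("near:", near)
print("far: ", far)
```

With these placeholder values the red channel collapses toward the background light as depth grows, which is the colour cast that enhancement networks are trained to undo.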
What carries the argument
The two-fold structure of a comprehensive review of deep networks plus a controlled experimental comparison that serves as a benchmark for the field.
If this is right
- Researchers can consult the summarized network details and training setups when designing new models.
- The identified shortcomings in datasets and metrics indicate concrete targets for improvement.
- The listed open issues provide explicit directions that subsequent papers can address.
- The benchmark comparison supplies a practical starting point for selecting methods on new underwater data.
Where Pith is reading between the lines
- A shared open benchmark built from the surveyed datasets could reduce duplicated experimental effort across labs.
- The image-formation models reviewed here may transfer to related degradation problems such as dehazing or low-light enhancement.
- Releasing code and exact training splits for the benchmark comparisons would allow the community to verify and extend the results.
Load-bearing premise
The chosen deep algorithms, datasets and evaluation protocols are representative of the field and the comparison contains no undisclosed biases in implementation or selection.
What would settle it
A re-run of the same algorithms on the same datasets that produces substantially different performance orderings or conclusions about which methods are strongest would undermine the benchmark.
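One way to make "substantially different performance orderings" concrete is a rank correlation between the published scores and the re-run scores. The sketch below computes Kendall's tau (the tau-a variant, without tie correction) in plain Python. The method names and scores are placeholders invented for illustration and are not results from the paper.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall rank correlation (tau-a) between two {method: score} dicts.

    Returns 1.0 for identical orderings and -1.0 for fully reversed ones.
    Only methods present in both dicts are compared.
    """
    methods = sorted(set(scores_a) & set(scores_b))
    if len(methods) < 2:
        raise ValueError("need at least two shared methods")
    concordant = discordant = 0
    for m1, m2 in combinations(methods, 2):
        sign = (scores_a[m1] - scores_a[m2]) * (scores_b[m1] - scores_b[m2])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(methods) * (len(methods) - 1) // 2
    return (concordant - discordant) / pairs

# Placeholder scores (assumption): higher is better, names are hypothetical.
published = {"method_a": 24.1, "method_b": 22.7, "method_c": 19.3, "method_d": 21.0}
rerun = {"method_a": 23.8, "method_b": 20.1, "method_c": 19.9, "method_d": 21.5}

print(f"Kendall tau = {kendall_tau(published, rerun):.3f}")
```

A tau close to 1 would indicate that the re-run preserves the ordering. How low a value counts as "substantially different" is a judgment call that the page does not specify.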
Original abstract
The powerful representation capacity of deep learning has made it inevitable for the underwater image enhancement community to employ its potential. The exploration of deep underwater image enhancement networks is increasing over time, and hence; a comprehensive survey is the need of the hour. In this paper, our main aim is two-fold, 1): to provide a comprehensive and in-depth survey of the deep learning-based underwater image enhancement, which covers various perspectives ranging from algorithms to open issues, and 2): to conduct a qualitative and quantitative comparison of the deep algorithms on diverse datasets to serve as a benchmark, which has been barely explored before. To be specific, we first introduce the underwater image formation models, which are the base of training data synthesis and design of deep networks, and also helpful for understanding the process of underwater image degradation. Then, we review deep underwater image enhancement algorithms, and a glimpse of some of the aspects of the current networks is presented including network architecture, network parameters, training data, loss function, and training configurations. We also summarize the evaluation metrics and underwater image datasets. Following that, a systematically experimental comparison is carried out to analyze the robustness and effectiveness of deep algorithms. Meanwhile, we point out the shortcomings of current benchmark datasets and evaluation metrics. Finally, we discuss several unsolved open issues and suggest possible research directions. We hope that all efforts done in this paper might serve as a comprehensive reference for future research and call for the development of deep learning-based underwater image enhancement.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript surveys deep learning-based underwater image enhancement, first reviewing underwater image formation models, then covering network architectures, parameters, training data, loss functions, and configurations of existing methods. It summarizes evaluation metrics and datasets, performs a systematic qualitative and quantitative comparison of selected deep algorithms across diverse datasets to establish a benchmark, identifies shortcomings in current datasets and metrics, and discusses open issues with suggested research directions.
Significance. If the benchmark comparison is representative and free of undisclosed selection or implementation biases, the work would provide a consolidated reference for the underwater image enhancement community. It pairs an in-depth review with a systematic experimental benchmark of deep methods, which the abstract describes as barely explored before, and could help standardize evaluation and highlight directions for future work.
major comments (2)
- [experimental comparison section (following the review of algorithms and metrics)] The experimental comparison section does not state explicit inclusion criteria for selecting the deep algorithms, datasets, or evaluation protocols used in the benchmark. This is load-bearing for the central claim that the comparison 'serves as a benchmark' because representativeness cannot be assessed without knowing the selection process or whether the chosen methods cover the range of architectures, losses, and training regimes reviewed earlier in the survey.
- [section describing the systematically experimental comparison] It is not specified whether the quantitative results were obtained by re-implementing all reviewed networks under identical hyperparameters and training configurations or by using author-provided code with potentially varying setups. This directly affects the validity of the robustness and effectiveness claims in the benchmark, as inconsistent implementation details could introduce biases not disclosed in the protocol.
minor comments (2)
- [abstract] The abstract contains a minor grammatical issue ('hence; a comprehensive survey') that should be corrected for clarity.
- [figures and tables in the experimental section] Some figure captions or table descriptions could more explicitly link back to the specific algorithms and datasets discussed in the review sections to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments on the experimental comparison section below and will revise the manuscript to provide the requested details.
Point-by-point responses
- Referee: The experimental comparison section does not state explicit inclusion criteria for selecting the deep algorithms, datasets, or evaluation protocols used in the benchmark. This is load-bearing for the central claim that the comparison 'serves as a benchmark' because representativeness cannot be assessed without knowing the selection process or whether the chosen methods cover the range of architectures, losses, and training regimes reviewed earlier in the survey.
  Authors: We agree that explicit inclusion criteria are not stated in the current manuscript. In the revision we will insert a dedicated paragraph (or subsection) immediately preceding the experimental results that specifies: (i) algorithm selection was limited to methods with publicly released code or sufficiently detailed architectures/losses to permit faithful re-implementation, chosen to span the main families reviewed earlier (CNN encoder-decoder, GAN-based, attention-augmented, etc.); (ii) datasets comprise the most widely adopted benchmarks containing both synthetic and real underwater images; (iii) evaluation protocols follow the standard metrics and train/test splits reported in the original papers. This addition will allow readers to judge representativeness directly.
  Revision: yes
- Referee: It is not specified whether the quantitative results were obtained by re-implementing all reviewed networks under identical hyperparameters and training configurations or by using author-provided code with potentially varying setups. This directly affects the validity of the robustness and effectiveness claims in the benchmark, as inconsistent implementation details could introduce biases not disclosed in the protocol.
  Authors: We acknowledge the omission. The revised manuscript will explicitly describe the protocol: author-provided implementations were used when available and run with their original hyper-parameters; remaining networks were re-implemented from the papers and trained under a common set of settings (optimizer, learning-rate schedule, batch size, and hardware) chosen to be as close as possible to the originals while ensuring comparability. A supplementary table will list per-method training configurations and any necessary adaptations. This clarification will be added to the experimental section.
  Revision: yes
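The per-method configuration table promised in the rebuttal could be kept as structured records so that deviations from the common settings are easy to audit. The sketch below is one possible shape for such a record. The field names follow the settings listed in the response (optimizer, learning-rate schedule, batch size, hardware), while the class name, the helper, and every value are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    method: str
    source: str              # "author_code" or "reimplemented"
    optimizer: str
    lr_schedule: str
    batch_size: int
    hardware: str
    adaptations: tuple = ()  # free-text notes on necessary changes

# Settings the rebuttal says are shared across re-implemented networks.
COMMON_FIELDS = ("optimizer", "lr_schedule", "batch_size", "hardware")

def deviations(config, common):
    """Map each shared field that differs to (method value, common value)."""
    return {
        name: (getattr(config, name), getattr(common, name))
        for name in COMMON_FIELDS
        if getattr(config, name) != getattr(common, name)
    }

# Placeholder records (assumption): none of these values come from the paper.
common = TrainingConfig("common", "reimplemented", "adam", "step", 16, "gpu_x")
example = TrainingConfig(
    "method_a", "author_code", "adam", "constant", 8, "gpu_x",
    adaptations=("input resized to match released weights",),
)

print(deviations(example, common))
```

Listing only the differing fields keeps the supplementary table short while still exposing every place where an author-provided setup departs from the shared one.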
Circularity Check
No circularity: survey and benchmark without derivation chain
Full rationale
This is a survey paper whose central claims are coverage of existing algorithms, datasets, metrics, and a comparative benchmark experiment. No mathematical derivations, first-principles predictions, or fitted parameters are presented that could reduce to their own inputs by construction. The two-fold aim (comprehensive review plus experimental comparison) rests on selection and implementation choices rather than self-definitional equations or self-citation chains that substitute for independent evidence. The paper therefore contains no load-bearing steps matching any of the enumerated circularity patterns.