Enhancing Hazy Wildlife Imagery: AnimalHaze3k and IncepDehazeGan
Pith reviewed 2026-05-10 08:14 UTC · model grok-4.3
The pith
A new GAN architecture trained on synthetic hazy wildlife photos restores image clarity enough to more than double object detection accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce AnimalHaze3k, a synthetic dataset of 3,477 hazy wildlife images generated from 1,159 clear photographs through a physics-based pipeline, and IncepDehazeGan, a GAN that fuses inception blocks with residual skip connections. The model achieves SSIM 0.8914, PSNR 20.54, and LPIPS 0.1104 while lifting YOLOv11 mAP by 112 percent and IoU by 67 percent on the dehazed outputs.
What carries the argument
IncepDehazeGan, a generative adversarial network that inserts inception blocks for multi-scale feature capture and residual skip connections for stable gradient flow during haze removal from wildlife photographs.
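The two architectural ingredients can be illustrated with a toy numpy analogue: mean filters standing in for the learned multi-scale convolution branches of an inception block, and an additive skip standing in for the residual connection. This is a sketch of the idea only, not the paper's network; all function names and scale choices here are illustrative.

```python
import numpy as np

def box_blur(x, k):
    """Mean filter with odd window k and reflect padding; a stand-in
    for one inception branch's receptive field (1x1, 3x3, 5x5, ...)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="reflect")
    h, w = x.shape
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w]
    return out / (k * k)

def inception_residual_block(x, scales=(1, 3, 5)):
    """Toy inception-style block: parallel branches at several scales
    are fused (here by simple averaging), and a residual skip adds the
    input back, so the block outputs x + F(x)."""
    fused = np.mean([box_blur(x, k) for k in scales], axis=0)
    return x + fused  # residual skip connection
```

The residual skip means the block only has to model a correction to its input, which is what makes gradient flow stable in the deep trained version.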
If this is right
- Dehazed wildlife images produce substantially higher accuracy in automated animal detection and tracking tasks.
- The synthetic AnimalHaze3k dataset supplies training data for dehazing models when real paired hazy-clear wildlife images are scarce.
- Improved visibility in restored frames supports downstream conservation tasks such as population counts and behavior studies.
- The reported metric gains establish a performance baseline for specialized haze removal in narrow-domain imagery.
Where Pith is reading between the lines
- If the synthetic haze model generalizes, the same architecture could be retrained on hazy images from other domains such as aerial surveys or underwater footage.
- A direct side-by-side test on real hazy field data would reveal whether the reported detection gains hold outside the synthetic distribution.
- Coupling the dehazer with tracking algorithms could reduce identity switches caused by haze-induced appearance changes.
Load-bearing premise
The physics-based pipeline that adds synthetic haze to clear wildlife photographs produces images that match the visual and statistical properties of real atmospheric haze encountered in the field.
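This premise can be made concrete. Physics-based haze synthesis conventionally uses the atmospheric scattering model I = J·t + A·(1 − t) with transmission t = exp(−β·depth); the sketch below assumes that formulation, since the paper's exact parameter choices are not stated in the summary, and the beta and airlight values are illustrative.

```python
import numpy as np

def add_synthetic_haze(clear, depth, beta=1.0, airlight=0.9):
    """Standard atmospheric scattering model:
        I = J * t + A * (1 - t),  t = exp(-beta * depth)
    clear:   HxWx3 image in [0, 1] (J)
    depth:   HxW relative depth map
    beta, airlight: the free parameters the review's ledger flags;
    values here are illustrative, not the paper's."""
    t = np.exp(-beta * depth)[..., None]      # per-pixel transmission
    return clear * t + airlight * (1.0 - t)
```

At zero depth the image is untouched (t = 1); at large depth every pixel converges to the airlight, which is exactly the failure mode the premise must survive: real field haze rarely follows this model so cleanly.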
What would settle it
Run IncepDehazeGan on a held-out collection of actual field-captured hazy wildlife photographs, measure the resulting SSIM/PSNR/LPIPS against human-labeled clear references, and compare the YOLOv11 mAP and IoU gains to the numbers obtained on the synthetic test set.
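The metrics named in that protocol are standard quantities. Minimal numpy reference implementations of two of them are sketched below; they make no claim about the paper's exact evaluation code, and SSIM and LPIPS need dedicated libraries (e.g. scikit-image and the lpips package).

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB (higher is better):
    10 * log10(data_range^2 / MSE)."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes,
    the quantity behind the reported 67% IoU gain."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union
```

Running these on real field captures against human-labeled references, rather than on the synthetic test split, is the experiment that would settle the question.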
Original abstract
Atmospheric haze significantly degrades wildlife imagery, impeding computer vision applications critical for conservation, such as animal detection, tracking, and behavior analysis. To address this challenge, we introduce AnimalHaze3k, a synthetic dataset comprising 3,477 hazy images generated from 1,159 clear wildlife photographs through a physics-based pipeline. Our novel IncepDehazeGan architecture combines inception blocks with residual skip connections in a GAN framework, achieving state-of-the-art performance (SSIM: 0.8914, PSNR: 20.54, and LPIPS: 0.1104), delivering 6.27% higher SSIM and 10.2% better PSNR than competing approaches. When applied to downstream detection tasks, dehazed images improved YOLOv11 detection mAP by 112% and IoU by 67%. These advances can provide ecologists with reliable tools for population monitoring and surveillance in challenging environmental conditions, demonstrating significant potential for enhancing wildlife conservation efforts through robust visual analytics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the AnimalHaze3k synthetic dataset of 3,477 hazy wildlife images generated from 1,159 clear photographs via a physics-based atmospheric scattering pipeline. It proposes IncepDehazeGan, a GAN that integrates inception blocks with residual skip connections, and reports state-of-the-art dehazing metrics on this dataset (SSIM 0.8914, PSNR 20.54, LPIPS 0.1104) along with 6.27% SSIM and 10.2% PSNR gains over prior methods. The work further claims that applying the dehazer to YOLOv11 detection yields 112% higher mAP and 67% higher IoU, positioning the contributions as aids for wildlife conservation in hazy conditions.
Significance. If the synthetic haze model generalizes, the domain-specific dataset and tailored GAN architecture would constitute a useful contribution to hazy-image enhancement for wildlife monitoring, potentially supporting downstream tasks like animal detection in conservation settings. The explicit focus on a wildlife domain and the reported downstream detection gains distinguish the work from generic dehazing papers, though the absence of real-field validation constrains the assessed impact.
Major comments (3)
- [Abstract and Results] The claimed 112% mAP and 67% IoU improvements for YOLOv11 are presented without baseline mAP or IoU values on the original hazy images and without any description of how the percentages were computed, rendering the magnitude of the downstream benefit unverifiable.
- [Dataset generation and Experiments] All quantitative results (SSIM, PSNR, LPIPS, and detection metrics) are obtained exclusively on the synthetic AnimalHaze3k data; no quantitative or qualitative evaluation on authentic hazy field imagery is reported, leaving the central claim of utility for real-world wildlife conservation untested.
- [Methodology] The paper introduces IncepDehazeGan as a novel combination of inception blocks and residual skip connections inside a GAN but provides no ablation studies isolating the contribution of each component, so the architectural rationale for the reported gains cannot be assessed.
Minor comments (2)
- [Abstract] The abstract asserts 'state-of-the-art' performance but does not tabulate the exact scores of the competing methods being compared.
- [Dataset] Training/validation/test split sizes and the precise atmospheric scattering parameters used in the physics-based pipeline are not stated in the provided summary.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, indicating the revisions we will incorporate where feasible.
Point-by-point responses
Referee: [Abstract and Results] The claimed 112% mAP and 67% IoU improvements for YOLOv11 are presented without baseline mAP or IoU values on the original hazy images and without any description of how the percentages were computed, rendering the magnitude of the downstream benefit unverifiable.
Authors: We agree that the baseline mAP and IoU values on the original hazy images, along with the exact computation method, should have been provided to make the reported gains verifiable. In the revised manuscript, we will include these baseline values for YOLOv11 on the hazy images and explicitly describe the percentage improvement formula as ((dehazed_metric - hazy_metric) / hazy_metric) * 100%. revision: yes
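The formula the authors commit to is simple enough to pin down in code. The baseline value below is a hypothetical placeholder, since the paper does not report one (which is precisely the referee's objection):

```python
def relative_gain_pct(baseline, improved):
    """Percentage improvement as defined in the rebuttal:
    ((dehazed_metric - hazy_metric) / hazy_metric) * 100."""
    return (improved - baseline) / baseline * 100.0

# Hypothetical illustration (NOT the paper's numbers): a hazy-image
# baseline mAP of 0.25 rising to 0.53 after dehazing is a 112% gain.
print(relative_gain_pct(0.25, 0.53))
```

Note how sensitive the percentage is to the baseline: the same absolute mAP gain of 0.28 would read as 56% from a baseline of 0.50, which is why the raw values matter.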
Referee: [Dataset generation and Experiments] All quantitative results (SSIM, PSNR, LPIPS, and detection metrics) are obtained exclusively on the synthetic AnimalHaze3k data; no quantitative or qualitative evaluation on authentic hazy field imagery is reported, leaving the central claim of utility for real-world wildlife conservation untested.
Authors: Our study is designed around the synthetic AnimalHaze3k dataset to enable precise quantitative evaluation using known ground-truth clear images. We acknowledge the value of real-world validation but note that paired real hazy and clear wildlife images with accurate annotations are not available to us. revision: partial
Referee: [Methodology] The paper introduces IncepDehazeGan as a novel combination of inception blocks and residual skip connections inside a GAN but provides no ablation studies isolating the contribution of each component, so the architectural rationale for the reported gains cannot be assessed.
Authors: We recognize that ablation studies are necessary to isolate the contributions of the inception blocks and residual skip connections. In the revised manuscript, we will add ablation experiments that systematically remove each component and report the resulting changes in SSIM, PSNR, and LPIPS on the AnimalHaze3k test set. revision: yes
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper introduces a new synthetic dataset (AnimalHaze3k) generated from clear wildlife photographs via a described physics-based pipeline and presents a novel GAN architecture (IncepDehazeGan) whose performance is evaluated using standard metrics (SSIM, PSNR, LPIPS) plus downstream detection on the same synthetic data. No equations, derivations, or self-referential definitions are provided that would reduce the reported gains to quantities defined by fitted parameters within the paper itself; the results are empirical outputs from training and testing rather than predictions forced by construction. Any self-citations are not load-bearing for the central claims, and the work remains self-contained against external benchmarks without invoking uniqueness theorems or ansatzes from prior author work.
Axiom & Free-Parameter Ledger
Free parameters (1)
- Atmospheric scattering parameters
Axioms (1)
- Domain assumption: the atmospheric scattering model accurately represents haze in natural wildlife scenes.