A drone-based framework for coral habitat mapping via weakly supervised segmentation
Pith reviewed 2026-05-25 08:07 UTC · model grok-4.3
The pith
Point classifications from underwater images can train high-resolution coral segmentation models on drone orthophotos without any pixel labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By bridging fine-scale multi-label predictions from underwater imagery with broad-coverage aerial data, the method converts point-level classifications into coarse masks that train a semantic segmentation model on UAV orthophotos; a subsequent self-refinement step using the model's own outputs further improves accuracy, yielding 86.07 percent pixel accuracy and 52.23 percent mIoU on annotated reef zones and enabling large-area segmentation of coral morphotypes.
What carries the argument
Multi-scale weakly supervised semantic segmentation pipeline that turns point classifications into coarse masks for UAV orthophoto training followed by self-refinement.
If this is right
- Large-area coral habitat maps become feasible without pixel-level annotation budgets.
- New coral morphotype classes can be added by supplying only point classifications.
- Segmentation models can be trained from mixed underwater and aerial sources at different resolutions.
- Self-refinement after initial coarse-mask training measurably raises spatial accuracy.
Where Pith is reading between the lines
- The same coarse-to-fine pipeline could be tested on other benthic habitats such as seagrass beds or kelp forests.
- If point density varies across surveys, performance sensitivity to that density becomes a direct next measurement.
- Combining the trained model with repeated drone flights would allow change detection over time without new labels.
- The method's reliance on cross-modal alignment suggests a natural extension to satellite imagery once coarse masks are available.
Load-bearing premise
The coarse masks created from point-level underwater classifications are accurate enough and spatially aligned with the UAV orthophotos to serve as usable training targets.
What would settle it
Train the model on the generated coarse masks and test it on a new set of manually pixel-annotated reef zones; if pixel accuracy falls near chance level or mIoU stays below 20 percent, the core claim does not hold.
Figures
read the original abstract
Obtaining pixel-level annotations over large spatial extents remains a major bottleneck for deploying machine learning in ecological applications. Here we present a multi-scale weakly supervised semantic segmentation (WSSS) framework that enables training high-resolution segmentation models from dense, classification-based outputs. Our method combines fine-scale, multi-label predictions from underwater imagery with broad-coverage aerial data. We convert these point-level classifications into coarse supervision masks that can be used to train a semantic segmentation model on Unmanned Aerial Vehicle (UAV) orthophotos. A second training step using the model's own refined predictions is then used to further improve spatial accuracy without requiring additional annotations. We demonstrate the approach on coral reef imagery, enabling large-area segmentation of coral morphotypes and illustrating its flexibility in integrating new classes. The final model achieves 86.07% pixel accuracy and 52.23% mean Intersection over Union (mIoU) on manually annotated reef zones, demonstrating that accurate large-scale coral segmentation can be obtained without pixel-level annotations. By bridging image classification and segmentation across scales and modalities, this method provides an efficient solution for deploying segmentation models in settings where annotations are unavailable and opens opportunities for scalable, efficient monitoring in ecology and beyond.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a multi-scale weakly supervised semantic segmentation (WSSS) framework for coral reef mapping. It converts point-level classifications from underwater imagery into coarse masks to supervise a high-resolution segmentation model on UAV orthophotos, followed by a self-training refinement step using the model's own predictions. The central empirical claim is that this yields 86.07% pixel accuracy and 52.23% mIoU on held-out manually annotated reef zones without requiring pixel-level annotations.
Significance. If the core assumption holds, the approach would meaningfully lower the annotation cost for large-area ecological segmentation by bridging classification outputs across underwater and aerial modalities, with potential for scalable monitoring. The self-training refinement is a standard but useful addition for improving spatial detail from coarse labels.
major comments (2)
- [Abstract] Abstract: the reported 86.07% pixel accuracy and 52.23% mIoU on held-out zones are presented without baseline comparisons, cross-validation details, error bars, or any description of how the manual test annotations were created or aligned to the UAV orthophotos; these omissions are load-bearing for the claim that the method produces accurate segmentation from weak supervision.
- [Method] Method description (implied in abstract): the conversion of underwater point classifications into coarse supervision masks for UAV orthophotos requires a projection/registration step whose spatial accuracy, scale matching, and label noise are not validated; substantial misalignment or viewpoint-induced errors would inject systematic noise into the training targets, and the subsequent self-training step could amplify rather than correct such artifacts, undermining the reported metrics.
minor comments (1)
- [Abstract] The abstract mentions 'coral morphotypes' and 'flexibility in integrating new classes' but does not specify the number of classes or provide a class-wise breakdown of the mIoU.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comments point-by-point below and will make revisions to improve clarity and support for our claims.
read point-by-point responses
-
Referee: [Abstract] Abstract: the reported 86.07% pixel accuracy and 52.23% mIoU on held-out zones are presented without baseline comparisons, cross-validation details, error bars, or any description of how the manual test annotations were created or aligned to the UAV orthophotos; these omissions are load-bearing for the claim that the method produces accurate segmentation from weak supervision.
Authors: We agree that the abstract would benefit from additional context on the evaluation protocol. In the revised version we will expand the abstract to note the inclusion of baseline comparisons (against other WSSS approaches and fully supervised models) that appear in the results, indicate that metrics are obtained via cross-validation on the held-out zones with standard deviations, and briefly describe the expert creation and georeferenced alignment of the manual test annotations to the UAV orthophotos. These additions will make the abstract self-contained while preserving its length. revision: yes
-
Referee: [Method] Method description (implied in abstract): the conversion of underwater point classifications into coarse supervision masks for UAV orthophotos requires a projection/registration step whose spatial accuracy, scale matching, and label noise are not validated; substantial misalignment or viewpoint-induced errors would inject systematic noise into the training targets, and the subsequent self-training step could amplify rather than correct such artifacts, undermining the reported metrics.
Authors: The projection and registration procedure is described in the methods, but we concur that dedicated validation of its accuracy is warranted. In the revision we will add quantitative assessment of registration error (using field control points and overlap statistics), scale-matching verification, and an ablation examining whether self-training reduces rather than amplifies label noise. This will directly address concerns about systematic artifacts in the training targets. revision: yes
Circularity Check
No significant circularity; empirical evaluation on independent manual annotations
full rationale
The paper describes a weakly supervised segmentation pipeline that generates coarse masks from point-level underwater classifications, trains a model on UAV orthophotos, applies self-training refinement, and reports pixel accuracy and mIoU on a separate set of manually annotated reef zones. No equations, fitted parameters, or predictions are presented that reduce the final metrics to definitions or self-referential constructions. The central performance claim rests on held-out manual annotations that are external to the training process. No self-citation chains or uniqueness theorems are invoked as load-bearing elements for the method. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Point-level classifications from underwater imagery can be reliably converted into spatially coarse but usable supervision masks for training a segmentation model on aligned aerial orthophotos.
Reference graph
Works this paper leans on
-
[1]
Loss functions in the era of semantic segmentation: A survey and outlook
Reza Azad, Moein Heidary, Kadir Yilmaz, Michael H ¨uttemann, Sanaz Karimijafarbigloo, Yuli Wu, Anke Schmeink, and Dorit Merhof. Loss functions in the era of semantic segmentation: A survey and outlook. arXiv preprint arXiv:2312.05391, 2023. 5, 16
-
[2]
Towards the fully automated monitoring of ecological communities
Marc Besson, Jamie Alison, Kim Bjerge, Thomas E Gorochowski, Toke T Høye, Tommaso Jucker, Hjalte MR Mann, and Christo- pher F Clements. Towards the fully automated monitoring of ecological communities. Ecology Letters, 25(12):2753–2775, 2022. 2
work page 2022
-
[3]
Angel Borja, Torsten Berg, Hege Gundersen, Anders Gjørwad Hagen, Kasper Hancke, Samuli Korpinen, Miguel C Leal, Tiziana Luisetti, Iratxe Menchaca, Ciaran Murray, et al. Innovative and practical tools for monitoring and assessing biodiversity status and impacts of multiple human pressures in marine systems. Environmental Monitoring and Assessment, 196(8):6...
work page 2024
-
[4]
Mapping benthic biodiversity indicators of coral reefs using spatial interpolation
L ´eo Broudic, Mathieu Pinault, Romain Claud, Touria Bajjouk, T ´evamie Rungassamy, Natacha Nikolic, Estelle Crochelet, Camille Maz´e, and Benjamin Bergerot. Mapping benthic biodiversity indicators of coral reefs using spatial interpolation. Coral Reefs, 2025. 2
work page 2025
-
[5]
Life-saving products from coral reefs
Andrew W Bruckner. Life-saving products from coral reefs. Issues in Science and Technology, 18(3):39–44, 2002. 2
work page 2002
-
[6]
A systematic review of robotic efficacy in coral reef monitoring techniques
Jennifer A Cardenas, Zahra Samadikhoshkho, Ateeq Ur Rehman, Alexander U Valle-P ´erez, Elena Herrera-Ponce de Le ´on, Char- lotte AE Hauser, Eric M Feron, and Rafiq Ahmad. A systematic review of robotic efficacy in coral reef monitoring techniques. Marine Pollution Bulletin, 202:116273, 2024. 2
work page 2024
-
[7]
A comprehensive analysis of weakly-supervised semantic segmen- tation in different image domains
Lyndon Chan, Mahdi S Hosseini, and Konstantinos N Plataniotis. A comprehensive analysis of weakly-supervised semantic segmen- tation in different image domains. International Journal of Computer Vision, 129(2):361–384, 2021. 5
work page 2021
-
[8]
Seatizen atlas image dataset, 2024
Matteo Contini, Julien Barde, Sylvain Bonhommeau, Victor Illien, and Alexis Joly. Seatizen atlas image dataset, 2024. 3
work page 2024
-
[9]
Dinovdeau-large-2024 04 03-with data aug batch-size32 epochs150 freeze – Hugging Face
Matteo Contini, C ´esar Leblanc, and Victor Illien. Dinovdeau-large-2024 04 03-with data aug batch-size32 epochs150 freeze – Hugging Face. https://doi.org/10.57967/hf/2947, 2024. 3
-
[10]
Matteo Contini, Sylvain Bonhommeau, Serge Bernard, and Julien Barde. Aerial images collected by an unmanned aerial vehicle in st-leu, r´eunion - 2023-12-08 (processed data), 2025. 3, 4
work page 2023
-
[11]
Matteo Contini, Sylvain Bonhommeau, Serge Bernard, and Julien Barde. Aerial images collected by an unmanned aerial vehicle in trou-deau, r´eunion - 2023-12-02 (processed data), 2025. 3, 4
work page 2023
-
[12]
Matteo Contini, Victor Illien, Julien Barde, Sylvain Poulain, Serge Bernard, Alexis Joly, and Sylvain Bonhommeau. From underwater to drone: A novel multi-scale knowledge distillation approach for coral reef monitoring. Ecological Informatics, page 103149, 2025. 2, 3, 8
work page 2025
-
[13]
Seatizen atlas: a collaborative dataset of underwater and aerial marine imagery
Matteo Contini, Victor Illien, Mohan Julien, Mervyn Ravitchandirane, Victor Russias, Arthur Lazennec, Thomas Chevrier, Cam Ly Rintz, L´eanne Carpentier, Pierre Gogendeau, et al. Seatizen atlas: a collaborative dataset of underwater and aerial marine imagery. Scientific Data, 12(1):67, 2025. 2, 3
work page 2025
-
[14]
Overcoming the uas limitations in the coastal environment for accurate habitat mapping
Michaela Doukari and Konstantinos Topouzelis. Overcoming the uas limitations in the coastal environment for accurate habitat mapping. Remote Sensing Applications: Society and Environment, 26:100726, 2022. 2
work page 2022
-
[15]
Pierre Gogendeau, Sylvain Bonhommeau, Hassen Fourati, Mohan Julien, Matteo Contini, Thomas Chevrier, Anne Elise Nieblas, and Serge Bernard. An autonomous surface vehicle for acoustic tracking, bathymetric and photogrammetric surveys.Ocean Engineering, 331:121201, 2025. 2, 3
work page 2025
-
[16]
Deny Hidayati et al. The importance of the sustainable use of fishery resources to improve the livelihoods of fishermen on the islands of sumatra and sulawesi, indonesia. In Proceedings of the 5th Conference on Agribusiness, Green Energy, Environment, and Sustainable Development (CAGEES-V5), pages 120–141, 2022. 2
work page 2022
-
[17]
Ove Hoegh-Guldberg, Peter Mumby, A.J. Hooten, R.S. Steneck, Paul Greenfield, Erick Gomez, Catherine Harvell, Peter Sale, Alas- dair Edwards, Ken Caldeira, Nancy Knowlton, C. Mark Eakin, Roberto Iglesias-Prieto, Nyawira Muthiga, Roger Bradbury, Alfonse Dubi, and M Hatziolos. Coral reefs under rapid climate change and ocean acidification. Science (New York,...
-
[18]
Deep-learning-based semantic segmentation of remote sensing images: A survey
Liwei Huang, Bitao Jiang, Shouye Lv, Yanbo Liu, and Ying Fu. Deep-learning-based semantic segmentation of remote sensing images: A survey. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17:8370–8396, 2024. 2
work page 2024
-
[19]
Climate change, human impacts, and the resilience of coral reefs
Terry P Hughes, Andrew H Baird, David R Bellwood, Margaret Card, Sean R Connolly, Carl Folke, Richard Grosberg, Ove Hoegh- Guldberg, Jeremy BC Jackson, Janice Kleypas, et al. Climate change, human impacts, and the resilience of coral reefs. science, 301 (5635):929–933, 2003. 2
work page 2003
-
[20]
Global warming and recurrent mass bleaching of corals
Terry P Hughes, James T Kerry, Mariana ´Alvarez-Noriega, Jorge G ´Alvarez-Romero, Kristen D Anderson, Andrew H Baird, Russell C Babcock, Maria Beger, David R Bellwood, Ray Berkelmans, et al. Global warming and recurrent mass bleaching of corals. Nature, 543(7645):373–377, 2017. 2
work page 2017
-
[21]
Adam: A Method for Stochastic Optimization
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014. 5
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[22]
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexan- der C Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF international conference on computer vision , pages 4015–4026, 2023. 2, 8
work page 2023
-
[23]
Mapping bio- diversity at very-high resolution in europe
C ´esar Leblanc, Lukas Picek, R ´emi Palard, Benjamin Deneu, Maximilien Servajean, Pierre Bonnet, and Alexis Joly. Mapping bio- diversity at very-high resolution in europe. In Proceedings of the Computer Vision and Pattern Recognition Conference , pages 2349–2358, 2025. 2
work page 2025
-
[24]
Samrefiner: Taming segment anything model for universal mask refinement
Yuqi Lin, Hengjia Li, Wenqi Shao, Zheng Yang, Jun Zhao, Xiaofei He, Ping Luo, and Kaipeng Zhang. Samrefiner: Taming segment anything model for universal mask refinement. arXiv preprint arXiv:2502.06756, 2025. 2, 5
-
[25]
Benjamin Misiuk and Craig J. Brown. Benthic habitat mapping: A review of three decades of mapping biological patterns on the seafloor. Estuarine, Coastal and Shelf Science, 296:108599, 2024. 2
work page 2024
-
[26]
Jos ´ephine Pierrat, L ´ea Urbistondoy, Alexandre Modi, Betsy Viramoutou, and Patrick Frouin. Searching for drivers of the patchy distribution of sympatric deposit-feeding sea cucumbers: A multi-scale monitoring study. Limnology and Oceanography , 69(9): 2057–2070, 2024. 8
work page 2057
-
[27]
Liang Qu, Xiaoli Song, Mengmeng Zhang, Juan Wang, Ruobing Wen, and Shengke Wang. Gan-based defogging and multiscale fusion approach for uav-based seagrass bed imagery semantic segmentation in challenging marine environments. In International Conference of Pioneering Computer Scientists, Engineers and Educators, pages 55–72. Springer, 2024. 2
work page 2024
-
[28]
Fisheries productivity under progressive coral reef degradation
Alice Rogers, Julia Blanchard, and Peter Mumby. Fisheries productivity under progressive coral reef degradation. Journal of Applied Ecology, 55, 2017. 2
work page 2017
-
[29]
Lixiang Ru, Yibing Zhan, Baosheng Yu, and Bo Du. Learning affinity from attention: End-to-end weakly-supervised semantic segmentation with transformers. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 16846–16855, 2022. 5
work page 2022
-
[30]
Scalable semantic 3d mapping of coral reefs with deep learning
Jonathan Sauder, Guilhem Banc-Prandi, Anders Meibom, and Devis Tuia. Scalable semantic 3d mapping of coral reefs with deep learning. Methods in Ecology and Evolution, 15(5):916–934, 2024. 2
work page 2024
-
[31]
The coralscapes dataset: Semantic scene understanding in coral reefs
Jonathan Sauder, Viktor Domazetoski, Guilhem Banc-Prandi, Gabriela Perna, Anders Meibom, and Devis Tuia. The coralscapes dataset: Semantic scene understanding in coral reefs. arXiv preprint arXiv:2503.20000, 2025. 2
-
[32]
Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations
Carole H Sudre, Wenqi Li, Tom Vercauteren, Sebastien Ourselin, and M Jorge Cardoso. Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: Third International Workshop, DLMIA 2017, and 7th International Workshop, ML-CDS...
work page 2017
-
[33]
Justine Talpaert Daudon, Matteo Contini, Isabel Urbina-Barreto, Brianna Elliott, Franc ¸ois Guilhaumon, Alexis Joly, Sylvain Bonhom- meau, and Julien Barde. Geoai for marine ecosystem monitoring: a complete workflow to generate maps from ai model predictions. In The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Scien...
work page 2023
-
[34]
Perspectives in machine learning for wildlife conservation
Devis Tuia, Benjamin Kellenberger, Sara Beery, Blair R Costelloe, Silvia Zuffi, Benjamin Risse, Alexander Mathis, Mackenzie W Mathis, Frank Van Langevelde, Tilo Burghardt, et al. Perspectives in machine learning for wildlife conservation. Nature communi- cations, 13(1):1–15, 2022. 2
work page 2022
-
[35]
Chemical pollution on coral reefs: exposure and ecological effects
Joost W van Dam, Andrew P Negri, Sven Uthicke, and Jochen F Mueller. Chemical pollution on coral reefs: exposure and ecological effects. In Ecological Impacts of Toxic Chemicals, pages 187–211. Bentham Science Publishers, 2011. 2
work page 2011
-
[36]
Junling Wang, Yupeng Wang, Liping Liu, Hengfu Yin, Ning Ye, and Can Xu. Weakly supervised forest fire segmentation in uav imagery based on foreground-aware pooling and context-aware loss. Remote Sensing, 15(14):3606, 2023. 2
work page 2023
-
[37]
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. In Advances in Neural Information Processing Systems , pages 12077–12090. Curran Associates, Inc., 2021. 2, 5
work page 2021
-
[38]
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, and Tianyi Zhou. A survey on knowledge distillation of large language models. arXiv preprint arXiv:2402.13116, 2024. 5
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[39]
How to do quantile normalization correctly for gene expression data analyses
Yaxing Zhao, Limsoon Wong, and Wilson Wen Bin Goh. How to do quantile normalization correctly for gene expression data analyses. Scientific reports, 10(1):15534, 2020. 4, 5
work page 2020
-
[40]
Zhuoyi Zhao, Chengyan Fan, and Lin Liu. Geo SAM: A QGIS plugin using Segment Anything Model (SAM) to accelerate geospatial image segmentation, 2023. 8
work page 2023
-
[41]
Kristina Øie Kvile, Hege Gundersen, Robert Nøddebo Poulsen, James Edward Sample, Arnt-Børre Salberg, Medyan Esam Ghareeb, Toms Buls, Trine Bekkby, and Kasper Hancke. Drone and ground-truth data collection, image annotation and machine learning: A protocol for coastal habitat mapping and classification. MethodsX, 13:102935, 2024. 2
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.