Underwater Waste Detection Using Deep Learning: A Performance Comparison of YOLOv7 to 10 and Faster RCNN
Pith reviewed 2026-05-19 02:58 UTC · model grok-4.3
The pith
YOLOv8 achieves 80.9% mAP for underwater waste detection, surpassing YOLOv7, YOLOv9, YOLOv10, and Faster R-CNN.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Among the five object recognition algorithms examined, YOLOv8 proves most effective at recognizing materials in underwater situations, attaining a mean Average Precision of 80.9% on a 15-class dataset. The paper attributes this result to YOLOv8's improved anchor-free mechanisms and self-supervised learning.
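To make the headline metric concrete, here is a minimal sketch of how a mean Average Precision figure is typically assembled: an Average Precision (AP) per class from ranked detections, then an unweighted mean over classes. The paper does not state its IoU threshold or interpolation scheme, so the all-point interpolation used here is an assumption, as are the function names and the premise that each detection has already been matched against ground truth.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_ground_truth: int) -> float:
    """AP for one class as the area under the precision-recall curve.

    detections: (confidence, is_true_positive) pairs. Matching detections to
    ground-truth boxes (e.g. by an IoU threshold) is assumed to be done already.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision with the best precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall levels.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

Because the mean is unweighted, a 15-class mAP can hide weak classes: a single scalar says nothing about which waste categories are detected poorly.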
What carries the argument
Comparative evaluation of YOLOv7 through YOLOv10 and Faster R-CNN models on an underwater waste dataset with 15 classes, highlighting YOLOv8's architectural advantages for precise detection in challenging marine environments.
If this is right
- Enhances the accuracy and efficiency of underwater pollution monitoring programs.
- Supports the development of more scalable systems for marine waste management.
- Improves detection capabilities in low-visibility and variable-depth conditions common in real-world scenarios.
Where Pith is reading between the lines
- Integration with robotic systems could automate waste collection in underwater environments.
- Similar model comparisons might benefit detection tasks in other obscured settings like fog or dense vegetation.
- Further optimization could focus on real-time processing speeds for deployment on underwater drones.
Load-bearing premise
The 15-class dataset collected under diverse underwater conditions adequately represents the variety of real-world waste and environments for the performance results to apply generally.
What would settle it
Evaluating the models on a separate underwater dataset from different locations or with additional waste categories where YOLOv8 no longer shows the highest mAP.
Underwater pollution is one of today's most significant environmental concerns, with vast volumes of garbage found in seas, rivers, and landscapes around the world. Accurate detection of these waste materials is crucial for successful waste management, environmental monitoring, and mitigation strategies. In this study, we investigated the performance of five cutting-edge object recognition algorithms, namely YOLO (You Only Look Once) models, including YOLOv7, YOLOv8, YOLOv9, YOLOv10, and Faster Region-Convolutional Neural Network (R-CNN), to identify which model was most effective at recognizing materials in underwater situations. The models were thoroughly trained and tested on a large dataset containing fifteen different classes under diverse conditions, such as low visibility and variable depths. From the above-mentioned models, YOLOv8 outperformed the others, with a mean Average Precision (mAP) of 80.9%, indicating a significant performance. This increased performance is attributed to YOLOv8's architecture, which incorporates advanced features such as improved anchor-free mechanisms and self-supervised learning, allowing for more precise and efficient recognition of items in a variety of settings. These findings highlight the YOLOv8 model's potential as an effective tool in the global fight against pollution, improving both the detection capabilities and scalability of underwater cleanup operations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports an empirical comparison of five object detection models—YOLOv7, YOLOv8, YOLOv9, YOLOv10, and Faster R-CNN—on a 15-class underwater waste dataset collected under conditions of low visibility and variable depths. It states that YOLOv8 achieves the highest mean Average Precision (mAP) of 80.9% and attributes this result to architectural elements including improved anchor-free mechanisms and self-supervised learning.
Significance. A well-controlled comparison of these models on underwater imagery could offer practical guidance for environmental monitoring and cleanup robotics. The current manuscript, however, provides no evidence that performance differences arise from the cited architectural features rather than uncontrolled differences in training protocols, making the central claim difficult to interpret or reproduce.
major comments (3)
- [Abstract] The attribution of the 80.9% mAP to YOLOv8's 'improved anchor-free mechanisms and self-supervised learning' is unsupported. No section describes a unified training protocol (identical optimizer, epoch count, batch size, learning rate schedule, or data augmentation pipeline) applied to all five models; without such isolation, mAP gaps cannot be causally linked to architecture.
- [Methods/Experimental Setup] (Inferred from absence in Abstract and Results.) The manuscript supplies no information on dataset partitioning, validation strategy, pre-training usage, or loss weighting. These details are load-bearing for any claim that one detector outperforms the others on the 15-class underwater task.
- [Results] No error bars, standard deviations across runs, or statistical significance tests accompany the reported mAP values. A single scalar of 80.9% for YOLOv8 cannot be evaluated for robustness or superiority without these measures.
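The robustness measures requested in the last comment can be sketched with the standard library alone. The snippet below summarizes per-seed mAP as mean and sample standard deviation, then compares two models with an exact paired sign-flip permutation test. This test is a stand-in chosen for illustration, not a method named in the manuscript. The function names are hypothetical, and any scores passed in would have to come from actual repeated training runs.

```python
import itertools
import statistics
from typing import Sequence, Tuple


def summarize(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-seed mAP values."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_sign_flip_p_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided exact paired permutation test on the summed difference.

    a[i] and b[i] must come from the same seed and data split. Under the null
    hypothesis the sign of each paired difference is arbitrary, so every sign
    assignment is enumerated and the fraction at least as extreme as the
    observed sum is returned.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total
```

One design consequence is worth noting: with n paired runs, the smallest two-sided p-value this test can return is 2 / 2^n, so five seeds bottom out at 0.0625 and cannot reach the conventional 0.05 level.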
minor comments (2)
- [Title] 'YOLOv7 to 10' is imprecise; it should explicitly list YOLOv7 through YOLOv10 for clarity.
- [Abstract] The phrase 'self-supervised learning' is used without definition or citation in the context of YOLOv8; standard YOLOv8 training is supervised, so the term requires clarification or removal.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which identifies important gaps in methodological transparency and statistical rigor. We address each major comment below and will revise the manuscript to strengthen reproducibility and support for our claims without overstating the current evidence.
- Referee: [Abstract] The attribution of the 80.9% mAP to YOLOv8's 'improved anchor-free mechanisms and self-supervised learning' is unsupported. No section describes a unified training protocol (identical optimizer, epoch count, batch size, learning rate schedule, or data augmentation pipeline) applied to all five models; without such isolation, mAP gaps cannot be causally linked to architecture.
  Authors: We agree that the abstract's attribution requires supporting details to establish a causal link. The manuscript states that models were 'thoroughly trained and tested' under similar conditions, but does not explicitly document a unified protocol. In the revision we will expand the Methods section with a table and description of the shared training configuration (optimizer, epochs, batch size, LR schedule, and augmentations) applied to all models, and we will revise the abstract to qualify the architectural attribution accordingly.
  Revision: yes
- Referee: [Methods/Experimental Setup] (Inferred from absence in Abstract and Results.) The manuscript supplies no information on dataset partitioning, validation strategy, pre-training usage, or loss weighting. These details are load-bearing for any claim that one detector outperforms the others on the 15-class underwater task.
  Authors: We acknowledge that these implementation details were omitted. The full manuscript text does not contain a dedicated experimental-setup subsection covering splits, validation, pre-training, or loss weighting. In the revised version we will add this information, specifying the train/validation/test partitioning ratios, any cross-validation procedure, use of COCO pre-trained weights, and class-balanced loss weighting for the 15 underwater waste categories.
  Revision: yes
- Referee: [Results] No error bars, standard deviations across runs, or statistical significance tests accompany the reported mAP values. A single scalar of 80.9% for YOLOv8 cannot be evaluated for robustness or superiority without these measures.
  Authors: We concur that single-run mAP values limit assessment of robustness. The current results section reports only point estimates. For the revision we will rerun the top-performing models with multiple random seeds to report mean mAP and standard deviation, and we will add a statistical comparison (e.g., paired t-test or Wilcoxon test) between YOLOv8 and the next-best model. If resource constraints prevent full multi-run experiments, we will explicitly note this limitation and qualify the superiority claim.
  Revision: partial
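The shared training configuration and fixed data partition discussed in these responses could be pinned down along the following lines. This is a sketch under stated assumptions: every value in it (optimizer, epoch count, split ratios, augmentation names) is a hypothetical placeholder rather than the paper's setting, and `SharedTrainingConfig`, `split_dataset`, and `build_run_plan` are illustrative names. Only the five model names come from the manuscript.

```python
import random
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class SharedTrainingConfig:
    # Hypothetical placeholder values; the manuscript does not report these.
    optimizer: str = "SGD"
    epochs: int = 100
    batch_size: int = 16
    lr_schedule: str = "cosine"
    augmentations: Tuple[str, ...] = ("horizontal_flip", "color_jitter")
    pretrained_weights: str = "COCO"


MODELS = ("YOLOv7", "YOLOv8", "YOLOv9", "YOLOv10", "Faster R-CNN")


def split_dataset(
    image_ids: Iterable[int],
    train_frac: float = 0.7,  # assumed ratio, not from the paper
    val_frac: float = 0.2,    # assumed ratio, not from the paper
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Deterministic train/val/test split so every model sees the same data."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def build_run_plan(config: SharedTrainingConfig, seeds: Iterable[int]) -> List[Dict]:
    """One run per (model, seed) pair, all sharing the same hyperparameters."""
    shared = asdict(config)
    return [{"model": m, "seed": s, **shared} for m in MODELS for s in seeds]
```

Freezing the configuration object and deriving every run from it keeps the architecture as the only variable that changes between runs, which is the isolation the first referee comment asks for.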
Circularity Check
No circularity: purely empirical comparison with no derivations
The paper reports mAP results from training YOLOv7–v10 and Faster R-CNN on a 15-class underwater dataset. No equations, derivations, or self-citations appear in the provided text. The claim that YOLOv8's 80.9% mAP stems from its architecture is an interpretive statement about observed results rather than a reduction of any prediction to fitted inputs or self-referential definitions. The work is self-contained as an empirical benchmark study.
Axiom & Free-Parameter Ledger
free parameters (1)
- training hyperparameters
axioms (1)
- Domain assumption: Standard supervised learning assumptions hold for object detection on the collected underwater images.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "YOLOv8 outperformed the others, with a mean Average Precision (mAP) of 80.9%... attributed to YOLOv8's architecture, which incorporates advanced features such as improved anchor-free mechanisms and self-supervised learning"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The models were thoroughly trained and tested on a large dataset containing fifteen different classes under diverse conditions, such as low visibility and variable depths."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.