AquaSight: Automatic Water Impurity Detection Utilizing Convolutional Neural Networks

Ankit Gupta; Elliott Ruebush

arXiv: 1907.07573 · v1 · pith:IA3MM32Tnew · submitted 2019-07-17 · cs.LG · cs.CV · stat.ML

Pith reviewed 2026-05-24 20:14 UTC · model grok-4.3

classification: cs.LG · cs.CV · stat.ML

keywords: water impurity detection, convolutional neural networks, deep learning, mobile application, image classification, water quality, turbidity

The pith

A convolutional neural network trained on 105 water images detects impurities at 96 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AquaSight, a mobile application that applies convolutional neural networks to automatically assess water samples for contamination from ordinary photos. The authors assembled a training set of 105 images that vary in contamination level, then report that the resulting model reaches 96 percent accuracy and a loss value of 0.108 while estimating turbidity and transparency. The goal is to give individuals a cheap, accessible way to check drinking water without laboratory equipment. If the approach works as described, it could let users flag polluted sources and notify governments for remediation. The work frames this as one practical response to large-scale water pollution documented by the United Nations.

Core claim

After training a convolutional neural network on 105 images of water with different contamination magnitudes, the model classifies impurity levels by analyzing turbidity and transparency, reaching 96 percent accuracy and enabling a mobile application that supplies rapid water-quality estimates to individuals and authorities.

What carries the argument

Convolutional Neural Networks that classify water images according to visual cues of turbidity and transparency.

If this is right

Individuals obtain a low-cost estimate of their local water quality from a phone photograph.
Contamination alerts can be sent to local and national governments for follow-up action.
The method supplies an alternative to laboratory-based water testing that requires no specialized equipment.
Widespread use could contribute to reduced exposure to polluted water supplies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same image-classification approach might be adapted to monitor other visible environmental indicators such as air quality or soil condition.
Pairing the app output with geographic data could produce crowd-sourced maps of water quality over time.
Expanding the training set beyond 105 images would be a direct next step to test whether accuracy holds for more diverse water bodies.

Load-bearing premise

The collection of 105 images adequately represents the range of real-world water contamination and the trained model will generalize to new samples from varied sources.

What would settle it

Running the published model on several hundred new water photographs collected from multiple independent real-world locations and obtaining accuracy well below 96 percent would falsify the reported performance.
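How decisive such a rerun would be depends on how many new photographs are scored. As a minimal sketch, the snippet below computes a Wilson score interval for an observed accuracy at several held-out set sizes. The sizes are hypothetical, since the paper does not report how many images the 96 percent figure was measured on, and `wilson_interval` is an illustrative helper, not something taken from the paper.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical held-out set sizes; the paper does not state its test-set size.
for total in (25, 100, 500):
    correct = round(0.96 * total)
    low, high = wilson_interval(correct, total)
    print(f"n={total:4d}  observed={correct / total:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

The interval narrows as the held-out set grows, which is why a rerun on several hundred independent photographs would be far more informative than a figure measured on a small fraction of 105 images.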

Original abstract

According to the United Nations World Water Assessment Programme, every day, 2 million tons of sewage and industrial and agricultural waste are discharged into the world's water. In order to address this pervasive issue of increasing water pollution, while ensuring that the global population has an efficient, accurate, and low-cost method to assess whether the water they drink is contaminated, we propose AquaSight, a novel mobile application that utilizes deep learning methods, specifically Convolutional Neural Networks, for automated water impurity detection. After comprehensive training with a dataset of 105 images representing varying magnitudes of contamination, the deep learning algorithm achieved a 96 percent accuracy and loss of 0.108. Furthermore, the machine learning model uses efficient analysis of the turbidity and transparency levels of water to estimate a particular sample of water's level of contamination. When deployed, the AquaSight system will provide an efficient way for individuals to secure an estimation of water quality, alerting local and national government to take action and potentially saving millions of lives worldwide.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The accuracy claim rests on an implausibly small dataset with no validation details, so the 96% figure doesn't support the generalization or deployment claims.

The paper takes a standard CNN and trains it on photos of water with different contamination levels to detect impurities. It packages this into a proposed mobile app. That's the extent of the contribution. It does highlight a practical need for cheap water testing tools in places with pollution problems.

The execution falls short on the basics. With only 105 images total and no mention of how they were divided for training versus testing, or any external validation set, the result is almost certainly not reliable. CNNs need much larger and more diverse data to learn features that hold up across different lighting, water types, or camera qualities. There's also no baseline comparison, like using simple image statistics for turbidity. The loss of 0.108 is reported but without context on what that means for new samples. The broader claims about alerting governments and saving lives are not connected to any experiments or data in the paper.

This kind of work might appeal to people interested in applied mobile apps for public health, but it lacks the rigor for machine learning researchers or environmental scientists. The thinking is not careful enough on the empirical side to make the result believable. I would not bring this to a reading group. I would not cite it. It does not deserve peer review.
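The letter above notes the absence of a baseline built on simple image statistics. One possible shape for such a baseline, offered only as a sketch, is a nearest-centroid classifier over two luminance features (mean and standard deviation). The feature choice, the function names, and the tiny made-up images are all assumptions for illustration; none of them come from the paper, and nothing here claims these features actually track turbidity.

```python
from statistics import mean, pstdev


def luminance_features(image):
    """Return (mean, standard deviation) of per-pixel luminance for an RGB image.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    """
    lum = [0.299 * r + 0.587 * g + 0.114 * b for row in image for (r, g, b) in row]
    return mean(lum), pstdev(lum)


def fit_centroids(images, labels):
    """Average the feature vectors of each class (nearest-centroid baseline)."""
    by_label = {}
    for image, label in zip(images, labels):
        by_label.setdefault(label, []).append(luminance_features(image))
    return {label: tuple(mean(col) for col in zip(*feats)) for label, feats in by_label.items()}


def predict(centroids, image):
    """Assign the label whose centroid is closest in feature space."""
    f = luminance_features(image)
    return min(centroids, key=lambda lab: sum((a - b) ** 2 for a, b in zip(f, centroids[lab])))


# Made-up 2x2 "images" purely to exercise the code; these are not real water samples.
clear = [[(200, 220, 240), (60, 90, 120)], [(210, 225, 245), (50, 80, 110)]]
murky = [[(120, 110, 90), (125, 112, 92)], [(118, 108, 88), (122, 111, 91)]]
centroids = fit_centroids([clear, murky], ["clear", "murky"])
print(predict(centroids, murky))
```

If a baseline of this kind scored close to the CNN on a proper held-out set, the added complexity of the network would be hard to justify; if it scored far below, the CNN result would carry more weight.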

Referee Report

2 major / 0 minor

Summary. The manuscript proposes AquaSight, a mobile application that employs convolutional neural networks to detect water impurities from images of water samples. It reports that a CNN trained on a dataset of 105 images achieved 96% accuracy with a loss of 0.108, and claims the system can provide low-cost water quality estimates to alert governments and save lives.

Significance. A validated, accessible mobile tool for turbidity-based water impurity detection could have practical value in low-resource settings. However, the reported result provides no evidence of generalization beyond the training set, limiting any assessment of real-world significance.

major comments (2)

[Abstract] The headline result of 96% accuracy (loss 0.108) on 105 images is presented with no description of train/validation/test partitioning, cross-validation procedure, data augmentation, or external test sets from different sources/lighting conditions. This omission makes it impossible to determine whether the metric reflects training performance or true generalization.
[Abstract] Standard CNN image classifiers require thousands of examples (or transfer learning plus augmentation) to learn robust features; with N=105 and no mention of these techniques, the reported accuracy is likely to be an overfit training-set figure that will not support the downstream claim of reliable real-world deployment for government alerts.
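The partitioning detail the first comment asks for could be supplied by something as simple as stratified k-fold cross-validation. The sketch below shows one way to build such folds for 105 labelled images using only the standard library. The three equal-sized classes are a hypothetical choice, since the paper does not state how many contamination classes it uses or how balanced they are, and `stratified_kfold` is an illustrative helper rather than the authors' procedure.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread across the k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing from where the previous class stopped.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Hypothetical labels: 105 images in three equal classes (not stated in the paper).
labels = ["low"] * 35 + ["medium"] * 35 + ["high"] * 35
for fold, (train, test) in enumerate(stratified_kfold(labels, k=5)):
    print(f"fold {fold}: train={len(train)} test={len(test)}")
```

Reporting the mean and spread of accuracy across folds like these would at least show whether the headline number was measured on images the model had not seen.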

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments. We respond point-by-point to the major comments below.

Referee: [Abstract] The headline result of 96% accuracy (loss 0.108) on 105 images is presented with no description of train/validation/test partitioning, cross-validation procedure, data augmentation, or external test sets from different sources/lighting conditions. This omission makes it impossible to determine whether the metric reflects training performance or true generalization.

Authors: We agree that the abstract omits these details. The submitted manuscript describes only a dataset of 105 images with no further specification of partitioning, cross-validation, augmentation, or external sets. We will revise the abstract to state the data split used and explicitly note the absence of cross-validation and external test sets.

Revision: partial

Referee: [Abstract] Standard CNN image classifiers require thousands of examples (or transfer learning plus augmentation) to learn robust features; with N=105 and no mention of these techniques, the reported accuracy is likely to be an overfit training-set figure that will not support the downstream claim of reliable real-world deployment for government alerts.

Authors: We acknowledge the validity of this concern. The abstract provides no information on transfer learning or augmentation, and the small dataset size raises a legitimate risk that the reported figure reflects training rather than generalization performance. We will revise the manuscript to discuss these limitations explicitly and moderate the claims regarding real-world deployment and life-saving potential to reflect the preliminary character of the study.

Revision: yes

standing simulated objections not resolved

No external test sets from different sources or lighting conditions exist in the study, so evidence of generalization beyond the collected images cannot be supplied.

Circularity Check

0 steps flagged

No circularity; empirical performance report with no derivation chain

The paper reports training a CNN on 105 images and states an achieved accuracy of 96% with loss 0.108. No equations, derivations, first-principles claims, or load-bearing self-citations appear in the provided text. The result is presented as an empirical outcome of training rather than a derived prediction that reduces to its inputs by construction. No steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on the empirical performance of a standard CNN without new theoretical contributions or invented concepts.

free parameters (1)

CNN hyperparameters
The model training involves choosing architecture and optimization parameters, but none are specified in the abstract.

axioms (1)

Domain assumption: the 105 images sufficiently represent contamination variations
The training relies on this dataset being adequate for the task.
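To make the ledger entries above concrete, the sketch below counts the trainable parameters of a small CNN and compares that count with the 105 training images. The architecture is entirely hypothetical, because the abstract specifies no layers, input size, or number of classes; it is a stand-in for "a modest image classifier", not the paper's model.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel + out_ch


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Hypothetical architecture for 64x64 RGB input; NOT the paper's model.
# Each convolution is assumed to use 'same' padding followed by 2x2 max pooling.
layers = [
    ("conv1", conv2d_params(3, 16, 3)),
    ("conv2", conv2d_params(16, 32, 3)),
    ("dense1", dense_params(32 * 16 * 16, 64)),  # 64x64 -> 32x32 -> 16x16 after two pools
    ("output", dense_params(64, 3)),             # three contamination classes assumed
]
total = sum(n for _, n in layers)
for name, n in layers:
    print(f"{name:7s} {n:>9,d}")
print(f"total   {total:>9,d}  (about {total // 105:,d} parameters per training image)")
```

Even this small network has on the order of half a million trainable parameters, which is why the adequacy of a 105-image dataset is the premise the result depends on.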
