
arxiv: 1906.10852 · v1 · pith:MZRZK55Xnew · submitted 2019-06-26 · 💻 cs.NE

Water Preservation in Soan River Basin using Deep Learning Techniques

Pith reviewed 2026-05-25 15:23 UTC · model grok-4.3

classification 💻 cs.NE
keywords stream flow prediction · deep learning · RNN · LSTM · Soan River Basin · water preservation · hydrology · climate variables

The pith

RNN and LSTM models outperform conventional algorithms for predicting stream flow in the Soan River Basin.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that deep learning techniques, specifically Recurrent Neural Networks and Long Short-term Memory networks, give better predictions of stream flow than traditional or other machine learning methods when using current climate data. Accurate forecasts matter for managing water supplies amid shifting climate and land use patterns that affect hydrological processes. The work also identifies precipitation, land usage, and temperature as direct influences on stream flow that hydrologists can track. The authors release their dataset to allow others to check and extend the findings.

Core claim

The central claim is that RNN or LSTM, as artificial neural network based methods, outperform other conventional and machine-learning algorithms for predicting stream flow given the present climate conditions. Stream flow is directly affected by precipitation, land usage, and temperature, and these indexes can be used by hydrologists to identify the potential for stream flow.

What carries the argument

Recurrent Neural Network (RNN) and Long Short-term Memory (LSTM) models applied to time-series prediction of stream flow from climate and land-use inputs.
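
To make the time-series framing concrete, here is a minimal sketch of how daily records might be cut into fixed-length windows before being fed to a recurrent model. The function name `make_windows`, the `lookback` parameter, and the two-column input layout (precipitation, temperature) are illustrative assumptions; the abstract does not say how the authors arranged their inputs.

```python
from typing import List, Sequence, Tuple

Window = List[List[float]]


def make_windows(
    features: Sequence[Sequence[float]],
    flow: Sequence[float],
    lookback: int,
) -> Tuple[List[Window], List[float]]:
    """Pair each run of `lookback` consecutive days of inputs with the next day's flow."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    if len(features) != len(flow):
        raise ValueError("features and flow must cover the same days")
    windows: List[Window] = []
    targets: List[float] = []
    for end in range(lookback, len(flow)):
        # Inputs come only from days strictly before the target day.
        windows.append([list(row) for row in features[end - lookback:end]])
        targets.append(flow[end])
    return windows, targets


# Hypothetical daily rows: [precipitation_mm, temperature_c]
daily_inputs = [[1.0, 20.0], [2.0, 21.0], [0.0, 22.0], [5.0, 19.0]]
daily_flow = [10.0, 12.0, 11.0, 15.0]
X, y = make_windows(daily_inputs, daily_flow, lookback=2)
```

Each element of `X` has shape (lookback, number of input variables), which is the layout sequence models typically expect; the target for a window is the flow on the day immediately after it.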

If this is right

  • Hydrologists can use precipitation, land usage, and temperature as key indexes to assess stream flow potential.
  • Deep learning models can support more reliable planning for water preservation under changing conditions.
  • Public release of the dataset enables replication and further model development by others.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar deep learning approaches might improve stream flow forecasts in other river basins with comparable data availability.
  • Incorporating additional variables such as soil type or evaporation rates could test and potentially strengthen the models.
  • Longer-term application of these forecasts could inform infrastructure decisions for drought or flood mitigation.

Load-bearing premise

The climate and land-use variables in the dataset are sufficient to capture the dominant drivers of stream flow without major unmodeled influences or data limitations affecting generalization.

What would settle it

A test showing that a conventional machine learning algorithm achieves higher prediction accuracy than RNN or LSTM on the same Soan River dataset or a comparable one would falsify the performance claim.

Figures

Figures reproduced from arXiv: 1906.10852 by Muhammad Shahid, Muhammad Waqas, Nan Wei, Obaid Ur Rehman, Sadaqat Ur Rehman, Shanshan Tu, Yongfeng Huang, Zhongliang Yang.

Figure 1: The overall framework of the proposed model, (a) CNN, (b) LSTM. These two models are used to extract the ...
Figure 2: The Location map of the Soan river basin and Distribution of Hydro-meteorological station
Figure 3: Precipitation and stream flow statistics of the Soan sub-basin 1 for the year 1983-2012
Figure 4: Precipitation and streamflow statistics of the Soan sub-basin 2 for the year 1983-2012
Figure 5: Prediction results using LSTM. The horizontal ...
Figure 7: Prediction error changes with days
Original abstract

Water supplies are crucial for the development of living beings. However, change in the hydrological process i.e. climate and land usage are the key issues. Sustaining water level and accurate estimating for dynamic conditions is a critical job for hydrologists, but predicting hydrological extremes is an open issue. In this paper, we proposed two deep learning techniques and three machine learning algorithms to predict stream flow, given the present climate conditions. The results showed that the Recurrent Neural Network (RNN) or Long Short-term Memory (LSTM), an artificial neural network based method, outperform other conventional and machine-learning algorithms for predicting stream flow. Furthermore, we analyzed that stream flow is directly affected by precipitation, land usage, and temperature. These indexes are critical, which can be used by hydrologists to identify the potential for stream flow. We make the dataset publicly available (https://github.com/sadaqat007/Dataset) so that others should be able to replicate and build upon the results published.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes RNN and LSTM models (alongside three unspecified machine-learning algorithms) to predict stream flow in the Soan River Basin from climate and land-use inputs. It claims that the recurrent architectures outperform the baselines, that stream flow is directly affected by precipitation, land usage, and temperature, and releases the dataset publicly for replication.

Significance. If the outperformance claim survives proper temporal validation and equivalent baseline tuning, the work could support practical hydrological forecasting for water preservation. The public dataset release is a clear positive for reproducibility in the field.

major comments (2)
  1. [Abstract/Results] Abstract and Results: the central claim that RNN/LSTM 'outperform other conventional and machine-learning algorithms' supplies no error metrics, no description of the three baseline algorithms, no hyperparameter search ranges, and no indication of whether a strict future-holdout split was used; without these details the reported superiority cannot be evaluated.
  2. [Methods] Methods: no information is given on feature preprocessing, temporal ordering of train/test data, or handling of time-series dependence; this directly undermines the load-bearing assumption that any observed accuracy gain arises from architecture rather than leakage or unequal tuning effort.
minor comments (2)
  1. [Abstract] Abstract: the phrasing 'RNN or LSTM' is ambiguous; clarify whether both architectures were evaluated separately and report their individual metrics.
  2. [Abstract] The statement that the listed variables 'directly affect' stream flow is presented without supporting correlation analysis or ablation results.
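
The strict future-holdout split the referee asks about can be sketched in a few lines. This is an assumed illustration, not the authors' procedure: the helper name `chronological_split`, the `test_fraction` default, and the sample dates are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Sequence, Tuple

Record = Tuple[date, float]


def chronological_split(
    records: Sequence[Record], test_fraction: float = 0.2
) -> Tuple[List[Record], List[Record]]:
    """Sort by date, then hold out the most recent slice as the test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(records) < 2:
        raise ValueError("need at least two records to split")
    ordered = sorted(records, key=lambda rec: rec[0])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    # Everything before the cut is training data; nothing is shuffled.
    return ordered[:cut], ordered[cut:]


# Hypothetical (date, stream flow) records, deliberately out of order.
start = date(2012, 1, 1)
records = [(start + timedelta(days=i), float(i)) for i in (3, 0, 9, 5, 1, 7, 2, 8, 4, 6)]
train, test = chronological_split(records, test_fraction=0.2)
```

Because every test date is later than every training date, a model cannot be scored on days that precede its training data. A random shuffle would break that guarantee, which is the leakage risk raised in major comment 2.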

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight important omissions in the reporting of experimental details that are necessary to substantiate the performance claims. We will revise the manuscript to supply the missing information on metrics, baselines, hyperparameters, validation splits, preprocessing, and temporal handling. Below we respond point by point.

  1. Referee: [Abstract/Results] Abstract and Results: the central claim that RNN/LSTM 'outperform other conventional and machine-learning algorithms' supplies no error metrics, no description of the three baseline algorithms, no hyperparameter search ranges, and no indication of whether a strict future-holdout split was used; without these details the reported superiority cannot be evaluated.

    Authors: We agree that the current manuscript does not report error metrics, name the three baseline algorithms, describe hyperparameter search ranges, or state whether a strict future-holdout temporal split was employed. These omissions prevent independent evaluation of the superiority claim. In the revised version we will add a dedicated Results subsection that reports quantitative metrics (RMSE, MAE, NSE), explicitly names and briefly describes the three baseline algorithms, documents the hyperparameter ranges and search method used for all models, and confirms that training and test sets were formed with a strict future-holdout split to respect temporal order. revision: yes

  2. Referee: [Methods] Methods: no information is given on feature preprocessing, temporal ordering of train/test data, or handling of time-series dependence; this directly undermines the load-bearing assumption that any observed accuracy gain arises from architecture rather than leakage or unequal tuning effort.

    Authors: The manuscript indeed provides no description of feature preprocessing steps, the procedure used to enforce temporal ordering between training and test data, or any explicit measures taken to avoid leakage from time-series autocorrelation. We acknowledge that this information is required to attribute performance differences to model architecture. The revised Methods section will include: (i) the exact preprocessing pipeline (normalization, missing-value handling, feature scaling), (ii) the chronological split dates and rationale, and (iii) any techniques applied to mitigate temporal dependence (e.g., lagged features, walk-forward validation). revision: yes
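
The three metrics named in the first response (RMSE, MAE, NSE) have standard definitions, sketched here in plain Python. The function names and sample values are illustrative assumptions; nothing below reproduces results from the paper.

```python
import math
from typing import Sequence


def _check(observed: Sequence[float], simulated: Sequence[float]) -> None:
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")


def rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Root mean squared error."""
    _check(observed, simulated)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed))


def mae(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Mean absolute error."""
    _check(observed, simulated)
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    _check(observed, simulated)
    mean_obs = sum(observed) / len(observed)
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0.0:
        raise ValueError("NSE is undefined when the observed series is constant")
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return 1.0 - sse / spread


# Hypothetical observed and predicted flows.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 2.0, 3.0, 3.0]
scores = {"rmse": rmse(obs, sim), "mae": mae(obs, sim), "nse": nse(obs, sim)}
```

An NSE of 1 means a perfect match, and an NSE of 0 means the model does no better than always predicting the observed mean. That makes it a useful companion to RMSE and MAE when comparing recurrent models against baselines on the same holdout period.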

Circularity Check

0 steps flagged

No circularity: empirical ML comparison with no derivation chain or self-referential claims


The paper presents an empirical study applying RNN/LSTM and conventional ML algorithms to a public stream-flow dataset. No equations, derivations, or mathematical claims are made that could reduce to inputs by construction. Performance assertions rest on reported experimental results rather than any fitted parameter renamed as a prediction or any self-citation chain. The absence of a derivation chain means none of the enumerated circularity patterns apply.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no specific free parameters, axioms, or invented entities can be extracted or audited from the provided text.


