A Resource-Efficient Hybrid CNN-LSTM network for image-based bean leaf disease classification
Pith reviewed 2026-05-10 13:01 UTC · model grok-4.3
The pith
A hybrid CNN-LSTM architecture classifies bean leaf diseases at 94.38% accuracy with a model size of only 1.86 MB.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By integrating an LSTM layer to model the spatial-sequential relationships within feature maps, the hybrid architecture achieves 94.38% accuracy with a footprint of 1.86 MB, which the paper reports as a 70% size reduction relative to traditional CNN-based systems. The paper also reports a state-of-the-art F1 score of 99.22% on the ibean dataset when the LSTM is paired with EfficientNet-B7.
What carries the argument
LSTM layer integrated after CNN feature extraction to model spatial-sequential relationships within the feature maps.
If this is right
- Enables real-time agricultural decision support in resource-constrained environments.
- Tailored image augmentations outperform generic combinations for preserving diagnostic patterns.
- Small model size supports deployment on portable devices for on-site diagnosis.
- EfficientNet-B7 combined with LSTM reaches top F1 performance on bean leaf tasks.
Where Pith is reading between the lines
- The hybrid structure may improve efficiency for image-based disease classification in other crops.
- Domain-specific augmentation choices indicate that general augmentation tools often fall short in plant pathology.
- Memory reduction could make AI diagnosis accessible to farms with basic hardware.
- Further tests across seasons and bean varieties would check reliability outside the original dataset.
Load-bearing premise
The ibean dataset together with the selected image augmentations sufficiently represent real-world variability in bean leaf appearance, lighting, and disease presentation.
What would settle it
A significant drop in classification accuracy when the model is tested on a fresh set of bean leaf images collected from different locations or under new lighting and growth conditions.
Original abstract
Accurate and resource-efficient automated diagnosis is a cornerstone of modern agricultural expert systems. While Convolutional Neural Networks (CNNs) have established benchmarks in plant pathology, their ability to capture long-range spatial dependencies is often limited by standard pooling layers, and their high memory footprint hinders deployment on portable devices. This paper proposes a lightweight hybrid CNN-LSTM system for bean leaf disease classification. By integrating an LSTM layer to model the spatial-sequential relationships within feature maps, our hybrid architecture achieves a 94.38% accuracy while maintaining an exceptionally small footprint of 1.86 MB; a 70% reduction in size compared to traditional CNN-based systems. Furthermore, we provide a systematic evaluation of image augmentation strategies, demonstrating that tailored transformations are superior to generic combinations for maintaining the integrity of diagnostic patterns. Results on the ibean dataset confirm that the proposed system achieves state-of-the-art F1 scores of 99.22% with EfficientNet-B7+LSTM, providing a robust and scalable framework for real-time agricultural decision support in resource-constrained environments. The code and augmented datasets used in this study are publicly available on this GitHub repo: https://github.com/HJin-R/bean_disease
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a lightweight hybrid CNN-LSTM architecture for classifying bean leaf diseases on the ibean dataset. It claims that adding an LSTM layer to model spatial-sequential relationships in CNN feature maps yields 94.38% accuracy at 1.86 MB model size (70% smaller than traditional CNNs), state-of-the-art F1 scores of 99.22% when paired with EfficientNet-B7, and superior results from tailored image augmentations. Public code and augmented datasets are provided.
Significance. If the empirical claims hold after verification, this would offer a practical advance in resource-efficient models for real-time plant disease diagnosis on portable devices, addressing deployment constraints in agriculture. Public code availability supports reproducibility and potential follow-on work.
major comments (3)
- [Abstract] Abstract: The central claim attributes performance gains and the 1.86 MB size to the LSTM modeling 'spatial-sequential relationships within feature maps.' Feature maps are 2D (H×W×C), yet no description is given of the required reshaping, flattening (row/column/patch-wise), or projection step to produce LSTM sequences. Without this or an ablation isolating LSTM contribution versus the base CNN, the hybrid mechanism and size benefit cannot be verified as load-bearing.
- [Abstract] Abstract: The 70% size reduction is stated relative to 'traditional CNN-based systems' with no named baselines, their reported sizes, or calculation details (e.g., parameter count vs. memory footprint). This directly undermines the resource-efficiency claim that is central to the paper's contribution.
- [Abstract] Abstract: State-of-the-art F1 (99.22% with EfficientNet-B7+LSTM) and accuracy (94.38%) claims lack any mention of train-test splits, number of runs, statistical tests, or direct comparisons to other models on identical splits. These omissions make the empirical results unverifiable and affect soundness of the hybrid architecture evaluation.
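The third major comment turns on how accuracy and F1 are computed and compared. As a minimal sketch, the snippet below derives both metrics from one confusion matrix. The three class names are an assumption about the ibean label set, and the counts are invented for illustration; they are not results from the paper.

```python
# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted).
# Class names are assumed; counts are invented and are NOT results from the paper.
CLASSES = ["angular_leaf_spot", "bean_rust", "healthy"]
CONFUSION = [
    [40, 2, 1],
    [3, 38, 2],
    [0, 1, 41],
]


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.4f}")
    print(f"macro F1 = {macro_f1(CONFUSION):.4f}")
```

Reporting both numbers for the same model on the same split makes figures such as 94.38% accuracy and 99.22% F1 directly comparable, which is what the referee asks for.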
minor comments (1)
- [Abstract] Abstract: The mention of 'systematic evaluation of image augmentation strategies' would benefit from a one-sentence summary of the key tailored transformations and their measured impact to strengthen the abstract.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below and indicate where revisions will be made to strengthen the manuscript.
- Referee: [Abstract] Abstract: The central claim attributes performance gains and the 1.86 MB size to the LSTM modeling 'spatial-sequential relationships within feature maps.' Feature maps are 2D (H×W×C), yet no description is given of the required reshaping, flattening (row/column/patch-wise), or projection step to produce LSTM sequences. Without this or an ablation isolating LSTM contribution versus the base CNN, the hybrid mechanism and size benefit cannot be verified as load-bearing.
  Authors: We agree that the reshaping mechanism and its contribution require explicit clarification. In the revised manuscript, we will add a precise description in the methods section explaining that the 2D feature maps are flattened row-wise into sequences (each spatial row treated as a time step) with a linear projection to match LSTM input dimensions. We will also include a new ablation study comparing the full hybrid CNN-LSTM against the base CNN without the LSTM layer, reporting accuracy, F1, and model size to isolate the LSTM's role.
  revision: yes
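The row-wise flattening described in this response can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the function names `rows_to_sequence` and `project`, the toy feature map, and the projection weights are all hypothetical.

```python
def rows_to_sequence(feature_map):
    """Turn an H x W x C feature map (nested lists) into H time steps of length W*C."""
    sequence = []
    for row in feature_map:  # one spatial row becomes one time step
        # Concatenate the C channel values of each of the W positions in this row.
        step = [value for position in row for value in position]
        sequence.append(step)
    return sequence


def project(sequence, weights):
    """Bias-free linear projection of each time step.

    `weights` holds one weight vector of length W*C per output unit, so the
    output step size equals len(weights), i.e. the assumed LSTM input size.
    """
    return [
        [sum(x * w for x, w in zip(step, unit_weights)) for unit_weights in weights]
        for step in sequence
    ]


# Toy example: H=2, W=2, C=3 (values invented for illustration).
fmap = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
seq = rows_to_sequence(fmap)            # 2 time steps, each of length 6
weights = [[1] * 6, [0, 0, 0, 0, 0, 1]]  # hypothetical projection to 2 features
lstm_inputs = project(seq, weights)
```

In a deep learning framework the same step would typically be a single reshape followed by a learned linear layer, with the resulting sequence passed to the LSTM.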
- Referee: [Abstract] Abstract: The 70% size reduction is stated relative to 'traditional CNN-based systems' with no named baselines, their reported sizes, or calculation details (e.g., parameter count vs. memory footprint). This directly undermines the resource-efficiency claim that is central to the paper's contribution.
  Authors: We acknowledge the need for concrete baselines and methodology. The revision will name specific traditional CNN models (ResNet50 and VGG16) used for comparison, report their sizes in MB, and detail the 70% reduction calculation based on total parameter counts converted to memory footprint (float32 precision). These will be presented in a new comparison table in the results section.
  revision: yes
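The conversion this response refers to is simple arithmetic, sketched below. The helper names are hypothetical, and the code assumes that "MB" means 2**20 bytes and that the footprint consists only of float32 weights. Only the 1.86 MB and 70% figures come from the text above; the other values are derived from them.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_MB = 1024 * 1024  # assumption: MB is taken as 2**20 bytes


def params_to_mb(num_params):
    """Memory needed to store `num_params` float32 weights, in MB."""
    return num_params * BYTES_PER_FLOAT32 / BYTES_PER_MB


def mb_to_params(size_mb):
    """Number of float32 weights that fit in `size_mb` MB."""
    return size_mb * BYTES_PER_MB / BYTES_PER_FLOAT32


def reduction_percent(baseline_mb, model_mb):
    """Size reduction of the model relative to a baseline, in percent."""
    return 100.0 * (1.0 - model_mb / baseline_mb)


def implied_baseline_mb(model_mb, reduction_pct):
    """Baseline size implied by a model size and a stated reduction."""
    return model_mb / (1.0 - reduction_pct / 100.0)


hybrid_mb = 1.86
approx_params = mb_to_params(hybrid_mb)             # roughly 4.9e5 weights
baseline_mb = implied_baseline_mb(hybrid_mb, 70.0)  # roughly 6.2 MB

if __name__ == "__main__":
    print(f"1.86 MB holds about {approx_params:,.0f} float32 parameters")
    print(f"a 70% reduction implies a baseline of about {baseline_mb:.2f} MB")
```

Under these assumptions the stated figures imply a single baseline of about 6.2 MB, so the promised comparison table can be checked by running each named baseline's parameter count through `params_to_mb` and `reduction_percent`.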
- Referee: [Abstract] Abstract: State-of-the-art F1 (99.22% with EfficientNet-B7+LSTM) and accuracy (94.38%) claims lack any mention of train-test splits, number of runs, statistical tests, or direct comparisons to other models on identical splits. These omissions make the empirical results unverifiable and affect soundness of the hybrid architecture evaluation.
  Authors: We agree these details are essential for verifiability. The revised manuscript will explicitly state the train-test split ratio, report results from multiple independent runs with mean and standard deviation, and include direct comparisons to other models on identical splits. Formal statistical tests were not performed, but variance across runs will be reported to support reliability. The public code repository already enables exact reproduction of the splits and experiments.
  revision: partial
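Reporting a mean and standard deviation over independent runs, as promised in this response, could look like the sketch below. The run accuracies are hypothetical placeholders, not values from the paper, and `summarize` is an illustrative helper name.

```python
import statistics

# Hypothetical test accuracies (%) from five independent runs with different seeds.
# These are placeholder values, NOT results reported in the paper.
run_accuracies = [94.1, 94.6, 93.9, 94.4, 94.5]


def summarize(values):
    """Return (mean, sample standard deviation) of per-run scores."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation, divides by n - 1
    return mean, std


mean_acc, std_acc = summarize(run_accuracies)

if __name__ == "__main__":
    print(f"accuracy: {mean_acc:.2f} ± {std_acc:.2f} % over {len(run_accuracies)} runs")
```

The sample standard deviation (dividing by n - 1) is used here because a handful of runs is a small sample of all possible seeds; a revised manuscript may of course choose a different convention.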
Circularity Check
No circularity; empirical results on external public dataset
The paper reports direct empirical measurements of accuracy (94.38%), model size (1.86 MB), and F1 scores from training a hybrid CNN-LSTM on the ibean dataset with stated code availability. No mathematical derivations, predictions, or first-principles results are claimed that reduce to quantities defined by the authors' own fitted parameters, self-citations, or ansatzes. The architecture description and augmentation evaluation are implementation choices evaluated against external benchmarks, with no load-bearing self-citation chains or self-definitional steps present.
Axiom & Free-Parameter Ledger
free parameters (1)
- model hyperparameters and augmentation parameters
axioms (1)
- domain assumption: The ibean dataset provides a sufficient and unbiased benchmark for evaluating bean leaf disease classification performance.
Reference graph
Works this paper leans on
- [1]
- [2] M. Venbrux, S. Crauwels, H. Rediers, Current and emerging trends in techniques for plant pathogen detection, Frontiers in Plant Science 14 (2023). doi:10.3389/fpls.2023.1120968. URL https://doi.org/10.3389/fpls.2023.1120968
- [3] A.-K. Mahlein, Plant disease detection by imaging sensors–parallels and specific demands for precision agriculture and plant phenotyping, Plant Disease 100 (2) (2016) 241–251. doi:10.1094/PDIS-03-15-0340-FE. URL https://doi.org/10.1094/PDIS-03-15-0340-FE
- [4] L. Rahunathan, D. Sivabalaselvamani, E. Elakkiya, M. Madhumitha, K. Kumaresh, Recognition of bean leaf diseases using neural network and machine learning techniques, in: 2023 3rd International Conference on Smart Data Intelligence (ICSMDI), 2023, pp. 520–526. doi:10.1109/ICSMDI57622.2023.00098
- [5] H. Slimani, Artificial Intelligence-based Detection of Fava Bean Rust Disease in Agricultural Settings: An Innovative Approach, International Journal of Advanced Computer Science & Applications 14 (6) (2023) 119–128. doi:10.14569/IJACS
- [6] Z. Islam, M. Islam, A. Amanullah, A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images, Informatics in Medicine Unlocked 20 (2020) 100412. doi:10.1016/j.imu.2020.100412
- [7] J. Donahue, L. A. Hendricks, M. Rohrbach, S. Venugopalan, S. Guadarrama, K. Saenko, T. Darrell, Long-term recurrent convolutional networks for visual recognition and description, IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (4) (2017) 677–691. doi:10.1109/TPAMI.2016.2599174
- [8] E. Önler, Feature fusion based artificial neural network model for disease detection of bean leaves, Electronic Research Archive 31 (5) (2023)
- [9] E. Elfatimi, R. Eryigit, L. Elfatimi, Beans Leaf Diseases Classification Using MobileNet Models, IEEE Access 10 (2022) 9471–9482. doi:10.1109/ACCESS.2022.3142817
- [10]
- [11] Z. Jian, Z. Wei, Support vector machine for recognition of cucumber leaf diseases, in: 2010 2nd International Conference on Advanced Computer Control, Vol. 5, 2010, pp. 264–266. doi:10.1109/ICACC.2010.5487242
- [12] Y. Lu, S. Yi, N. Zeng, Y. Liu, Y. Zhang, Identification of rice diseases using deep convolutional neural networks, Neurocomputing 267 (2017) 378–384. doi:10.1016/j.neucom.2017.06.023. URL https://www.sciencedirect.com/science/article/pii/S0925231217311384
- [13] G. Geetharamani, J. Arun Pandian, Identification of plant leaf diseases using a nine-layer deep convolutional neural network, Computers & Electrical Engineering 76 (2019) 323–338. doi:10.1016/j.compeleceng.2019.04.011. URL https://www.sciencedirect.com/science/article/pii/S0045790619300023
- [14]
- [15] M. A. Patil, M. Manohar, Plant leaf disease classification using optimal tuned hybrid lstm-cnn model, SN Computer Science 4 (6) (2023) 710. doi:10.1007/s42979-023-02245-7
- [16] E. Devi, S. Gopi, U. Padmavathi, S. R. Arumugam, S. Premnath, D. Muralitharan, Plant disease classification using cnn-lstm techniques, in: 2023 5th International Conference on Smart Systems and Inventive Technology (ICSSIT), 20...
HJ Rhee and J Akinyemi: Preprint submitted to Elsevier
- [17] M. A. Haque, C. K. Deb, P. Gole, S. Karmakar, A. Dheeraj, M. U. Din Shah, S. Dutta, M. K. P. Kumar, S. Marwaha, An enhanced vision transformer network for efficient and accurate crop disease detection, Expert Systems with Applications 283 (2025) 127743. doi:10.1016/j.eswa.2025.127743. URL https://www.sciencedirect.com/science/article/pii/ ...
- [18] A. Abade, P. A. Ferreira, F. de Barros Vidal, Plant diseases recognition on images using convolutional neural networks: A systematic review, Computers and Electronics in Agriculture 185 (2021) 106125. doi:10.1016/j.compag.2021.106125. URL https://www.sciencedirect.com/science/article/pii/S0168169921001435
- [19] S. Singla, R. Gupta, Deep Learning based Bean Leaf Lesion Classification utilizing EfficientNetV2-S, in: 2024 8th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 2024, pp. 1394–1399. doi:10.1109/I-SMAC61858.2024.10714612
- [20] D.-C. Rodríguez-Lira, D.-M. Córdova-Esparza, J. M. Álvarez Alvarado, J.-A. Romero-González, J. Terven, J. Rodríguez-Reséndiz, Comparative Analysis of YOLO Models for Bean Leaf Disease Detection in Natural Environments, AgriEngineering 6 (4) (2024) 4585–4603. doi:10.3390/agriengineering6040262. URL https://www.mdpi.com/2624-7402/6/4/262
- [21] F. Hohman, M. B. Kery, D. Ren, D. Moritz, Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, CHI ’24, ACM, 2024, pp. 1–18. doi:10.1145/3613904.3642109. URL http://dx.doi.org/10.1145/3613904.3642109
- [22] H. Sun, H. Xu, B. Liu, D. He, J. He, H. Zhang, N. Geng, MEAN-SSD: A novel real-time detector for apple leaf diseases using improved light-weight convolutional neural networks, Computers and Electronics in Agriculture 189 (2021) 106379. doi:10.1016/j.compag.2021.106379. URL https://www.sciencedirect.com/science/article/pii/S0168169921003963
- [23] M. Arsenovic, M. Karanovic, S. Sladojevic, A. Anderla, D. Stefanovic, Solving current limitations of deep learning based approaches for plant disease detection, Symmetry 11 (7) (2019). doi:10.3390/sym11070939. URL https://www.mdpi.com/2073-8994/11/7/939
- [24]
- [25] L. Taylor, G. Nitschke, Improving deep learning with generic data augmentation, in: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018, pp. 1542–1547. doi:10.1109/SSCI.2018.8628742
- [26] C. Shorten, T. M. Khoshgoftaar, A survey on image data augmentation for deep learning, Journal of Big Data 6 (60) (2019). doi:10.1186/s40537-019-0197-0
- [27] AI-Lab-Makerere / ibean, https://github.com/AI-Lab-Makerere/ibean/ (Jan. 2020)
- [28] A. Muimba-Kankolongo, Food Crop Production by Smallholder Farmers in Southern Africa, Science Direct, 2018. URL https://www.sciencedirect.com/book/9780128143834/food-crop-production-by-smallholder-farmers-in-southern-africa
- [29] L. Deng, J. C. Platt, Ensemble deep learning for speech recognition, Interspeech 1 (2014). doi:10.21437/Interspeech.2014-433
- [30] T. N. Sainath, O. Vinyals, A. Senior, H. Sak, Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks, in: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015, pp. 4580–4584. doi:10.1109/ICASSP.2015.7178838
- [31] G. Ercolano, S. Rossi, Combining CNN and LSTM for activity of daily living recognition with a 3D matrix skeleton representation, Intelligent Service Robotics 14 (2021) 175–185. doi:10.1007/s11370-021-00358-7
- [32] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: Y. Bengio, Y. LeCun (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. URL http://arxiv.org/abs/1409.1556
- [33]
- [34] M. S. Nixon, A. A. Aguado, Feature Extraction and Image Processing for Computer Vision, Academic Press, London, 2020
- [35] K. He, X. Zhang, S. Ren, J. Sun, Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, in: 2015 IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1026–1034. doi:10.1109/ICCV.2015.123
- [36] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780. arXiv:https://direct.mit.edu/neco/article-pdf/9/8/1735/813796/neco.1997.9.8.1735.pdf, doi:10.1162/neco.1997.9.8.1735. URL https://doi.org/10.1162/neco.1997.9.8.1735
- [37] A. Tharwat, Classification assessment methods, Applied Computing & Informatics 17 (1) (2021) 168–192. doi:10.1016/j.aci.2018.08.003
- [38]
- [39] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, in: 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626. doi:10.1109/ICCV.2017.74
- [40]
- [41] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014)
- [42] S. H. Abed, A. S. Al-Waisy, H. J. Mohammed, S. Al-Fahdawi, A modern deep learning framework in robot vision for automated bean leaves diseases detection, International Journal of Intelligent Robotics and Applications 5 (2) (2021) 235–251
- [43]
- [44] A. Sunyoto, D. Ariatmanto, et al., Innovative solutions for bean leaf disease detection using deep learning, in: 2024 IEEE International Conference on Artificial Intelligence and Mechatronics Systems (AIMS), IEEE, 2024, pp. 1–5
- [45] E. Jain, A. Aneja, Automated detection and classification of bean leaf diseases using inceptionv3: A deep learning approach, in: 2025 International Conference on Electronics and Renewable Systems (ICEARS), 2025, pp. 1890–1895. doi:10.1109/ICEARS64219.2025.10941547
- [46] R. Karthik, R. Aswin, K. S. Geetha, K. Suganthi, An explainable deep learning network with transformer and custom cnn for bean leaf disease classification, IEEE Access 13 (2025) 38562–38573. doi:10.1109/ACCESS.2025.3546017
- [47] K. Kahatapitiya, R. Rodrigo, Exploiting the Redundancy in Convolutional Filters for Parameter Reduction, in: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 1409–1419. doi:10.1109/WACV48630.2021.00145
- [48] J. G. A. Barbedo, A review on the main challenges in automatic plant disease identification based on visible range images, Biosystems Engineering 144 (2016) 52–60. doi:10.1016/j.biosystemseng.2016.01.017. URL https://www.sciencedirect.com/science/article/pii/...
- [49] G. Fenu, F. M. Malloci, DiaMOS Plant: A Dataset for Diagnosis and Monitoring Plant Disease, Agronomy 11 (11) (2021). doi:10.3390/agronomy11112107. URL https://www.mdpi.com/2073-4395/11/11/2107