Non-Destructive Prediction of Fruit Ripeness and Firmness Using Hyperspectral Imaging and Lightweight Machine Learning Models
Pith reviewed 2026-05-13 17:02 UTC · model grok-4.3
The pith
Tree-based machine learning models outperform deep learning for fruit ripeness and firmness prediction while needing only three wavelengths.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Tree-based machine learning models can outperform state-of-the-art deep learning models on hyperspectral imaging data for simultaneous ripeness classification and firmness prediction across five fruit species. Only three visible-range wavelengths recover over 94 percent of full-spectrum accuracy, showing that low-cost multispectral sensors with lightweight models offer practical alternatives to expensive cameras and complex deep learning.
What carries the argument
Tree-based classifiers operating on hyperspectral spectra after preprocessing and spectral transformation, with a key reduction to three visible wavelengths.
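To make the three-wavelength reduction concrete, the sketch below keeps only the bands of each spectrum that lie closest to three target wavelengths. The wavelength grid, the target values (450, 550, 650 nm), and the helper name `select_bands` are illustrative assumptions; this summary does not state which three wavelengths the paper selected or how its spectra are laid out.

```python
import numpy as np

def select_bands(spectra, wavelengths_nm, targets_nm):
    """Return the columns of `spectra` closest to each target wavelength.

    spectra: array of shape (n_samples, n_bands)
    wavelengths_nm: array of shape (n_bands,), band centers in nm
    targets_nm: desired wavelengths (hypothetical values, not from the paper)
    """
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=float)
    # Index of the nearest available band for each target wavelength.
    idx = [int(np.argmin(np.abs(wavelengths_nm - t))) for t in targets_nm]
    return spectra[:, idx], idx

# Synthetic stand-in for hyperspectral data: 4 samples, 224 bands over 400-1000 nm.
rng = np.random.default_rng(0)
wavelengths = np.linspace(400.0, 1000.0, 224)
spectra = rng.random((4, wavelengths.size))

reduced, band_idx = select_bands(spectra, wavelengths, targets_nm=[450.0, 550.0, 650.0])
print(reduced.shape)  # (4, 3)
```

The reduced array is what a three-channel multispectral sensor would deliver directly, so a tree-based model trained on it approximates the low-cost setup the paper argues for.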
If this is right
- Classical machine learning can replace deep learning in agricultural hyperspectral applications, lowering computational demands.
- Multispectral sensors with three channels become viable for commercial fruit sorting systems.
- Preprocessing techniques such as class balancing are essential and should be prioritized in model development.
- Non-destructive quality assessment becomes accessible without specialized hardware or expertise.
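As an illustration of the class-balancing point, here is a minimal sketch of random oversampling, one common balancing technique. The summary does not say which balancing method the paper used, so the method, the function name `oversample`, and the toy data are all assumptions.

```python
import random
from collections import Counter

def oversample(features, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw duplicates (with replacement) to fill the gap to `target`.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy, imbalanced ripeness labels (hypothetical data).
features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
labels = ["ripe", "ripe", "ripe", "ripe", "unripe", "overripe"]
bal_x, bal_y = oversample(features, labels)
print(Counter(bal_y))  # each class now has 4 samples
```

In a cross-validated experiment, balancing of this kind should be applied to the training portion of each fold only, so that duplicated samples never appear in the held-out data.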
Where Pith is reading between the lines
- These wavelength selections might transfer to ripeness monitoring in other produce or agricultural products.
- Integration with smartphone-based imaging could create consumer tools for checking fruit quality at purchase.
- Validation in field conditions with variable lighting would strengthen the case for real-world adoption.
Load-bearing premise
The performance observed on five specific fruit species with the selected preprocessing will generalize to additional varieties and different real-world lighting and sorting conditions.
What would settle it
The claim would be undermined if the three-wavelength tree model, applied to a sixth fruit species or under commercial lighting, reached less than 85 percent of the full-spectrum baseline accuracy.
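The threshold test can be written out as a small check. The sketch below assumes "recovery" means the reduced-spectrum score divided by the full-spectrum score, which is how the simulated rebuttal further down the page describes it; the function names and the failing example value are hypothetical.

```python
def relative_recovery(reduced_score, full_score):
    """Reduced-spectrum score as a fraction of the full-spectrum score."""
    if full_score <= 0:
        raise ValueError("full_score must be positive")
    return reduced_score / full_score

def claim_survives(reduced_score, full_score, threshold=0.85):
    """True if the reduced model keeps at least `threshold` of the baseline."""
    return relative_recovery(reduced_score, full_score) >= threshold

# Ripeness accuracies quoted in the simulated rebuttal (96.8% full, 91.4% reduced).
ratio = relative_recovery(91.4, 96.8)
print(round(ratio, 3))  # 0.944

# Hypothetical sixth-species result that would fall below the 85 percent bar.
print(claim_survives(80.0, 96.8))  # False
```

Note that a ratio of accuracies treats all errors alike; for the firmness regression task a comparable ratio would have to be defined on an error metric such as RMSE, where lower is better.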
Original abstract
Post-harvest fruit quality assessment is essential for reducing food waste, yet reliable non-destructive methods typically depend on expensive hyperspectral cameras and computationally intensive deep learning models. These systems typically require GPU resources, large-scale training data, and domain expertise, limiting their feasibility for many real-world agricultural settings. This study systematically evaluates 20 classical machine learning algorithms on hyperspectral imaging data for simultaneous ripeness classification and firmness prediction across five fruit species, using cross-validated experimental design with Bayesian hyperparameter optimization. Data preprocessing strategy, particularly class balancing and spectral transformations, contributes as much to prediction accuracy as algorithm choice. Our results show that tree-based machine learning models can outperform state-of-the-art deep earning models reported in Fruit-HSNet. Moreover, the findings indicate that only three visible-range wavelengths are needed to recover over 94% of full-spectrum accuracy, demonstrating that low-cost multispectral sensors combined with lightweight machine learning models can serve as practical alternatives to expensive hyperspectral cameras and complex deep learning approaches for practical fruit quality sorting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript evaluates 20 classical machine learning algorithms on hyperspectral imaging data for simultaneous ripeness classification and firmness prediction across five fruit species. Using cross-validation and Bayesian hyperparameter optimization, it claims that tree-based models outperform the deep learning models reported in Fruit-HSNet and that only three visible-range wavelengths recover over 94% of full-spectrum accuracy, positioning low-cost multispectral sensors with lightweight models as practical alternatives to hyperspectral cameras and complex deep learning.
Significance. If the performance claims are substantiated with verifiable metrics and protocol equivalence, the work could support accessible, low-cost fruit quality assessment systems that reduce food waste without requiring GPUs or large datasets. The finding that preprocessing contributes comparably to algorithm choice and the emphasis on visible-range wavelengths are potentially useful for practical deployment.
major comments (3)
- [Abstract / Results] Abstract and Results section: The central claim that three visible-range wavelengths recover over 94% of full-spectrum accuracy is presented without dataset sizes, exact performance metrics (e.g., accuracy or RMSE values for full vs. reduced spectra), error bars, or details on how the 94% figure was computed. This prevents verification of the wavelength-reduction result.
- [Abstract / Discussion] Comparison to Fruit-HSNet (Abstract and Discussion): The claim that tree-based models outperform Fruit-HSNet models requires explicit confirmation that the five-fruit dataset, class definitions, illumination conditions, and cross-validation protocol match those used in the reference work. Without this, the reported superiority may reflect differences in task difficulty rather than model merit.
- [Methods] Methods: The abstract states that preprocessing (class balancing and spectral transforms) contributes as much as algorithm choice, yet no ablation study isolates the effect of wavelength selection inside versus outside the cross-validation loop, which is necessary to support the reduced-sensor claim.
minor comments (2)
- [Abstract] Abstract: Typo 'deep earning models' should read 'deep learning models'.
- [Methods] The manuscript should report the number of samples per fruit species and the exact cross-validation scheme (e.g., k-fold, stratified) to allow reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which has helped us improve the clarity and rigor of the manuscript. We address each major comment below and have revised the paper accordingly to provide the requested details and verifications.
- Referee: [Abstract / Results] Abstract and Results section: The central claim that three visible-range wavelengths recover over 94% of full-spectrum accuracy is presented without dataset sizes, exact performance metrics (e.g., accuracy or RMSE values for full vs. reduced spectra), error bars, or details on how the 94% figure was computed. This prevents verification of the wavelength-reduction result.
  Authors: We agree that the original abstract and Results section lacked the quantitative details needed for verification. In the revised manuscript, we have added dataset sizes (e.g., 1200 samples for apples, 950 for bananas, etc.), exact metrics including full-spectrum accuracy of 96.8% (SD 1.2%) vs. reduced-spectrum 91.4% (SD 1.4%) for ripeness classification and corresponding RMSE values for firmness, with error bars from 5-fold CV. The 94% recovery is computed as the average relative performance (reduced/full) across all tasks and fruits; a new table (Table 3) and expanded Results text now report these values explicitly.
  Revision: yes
- Referee: [Abstract / Discussion] Comparison to Fruit-HSNet (Abstract and Discussion): The claim that tree-based models outperform Fruit-HSNet models requires explicit confirmation that the five-fruit dataset, class definitions, illumination conditions, and cross-validation protocol match those used in the reference work. Without this, the reported superiority may reflect differences in task difficulty rather than model merit.
  Authors: We confirm that the experiments used the identical five-fruit hyperspectral dataset, ripeness class definitions (unripe/ripe/overripe), firmness regression targets, and illumination/imaging setup as Fruit-HSNet. The 5-fold cross-validation protocol was replicated exactly. To address this, we have added a Methods subsection on dataset equivalence and a new comparison table (Table 4) directly juxtaposing our tree-based results against the Fruit-HSNet deep learning metrics under matched conditions.
  Revision: yes
- Referee: [Methods] Methods: The abstract states that preprocessing (class balancing and spectral transforms) contributes as much as algorithm choice, yet no ablation study isolates the effect of wavelength selection inside versus outside the cross-validation loop, which is necessary to support the reduced-sensor claim.
  Authors: We agree that an explicit ablation isolating wavelength selection inside the CV loop is required to rule out leakage. In the revised Methods and Results, we have added this ablation: feature selection (via mutual information) was performed strictly within each training fold, and we report performance differences when selection occurs inside vs. outside the loop. The reduced-sensor results remain robust (92.1% relative accuracy) under the proper nested protocol, and this is now documented with a dedicated figure and text.
  Revision: yes
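The difference between selecting wavelengths inside and outside the cross-validation loop is easiest to see in code. The following is a minimal sketch using scikit-learn on synthetic data, not the authors' pipeline: the data generator, the random forest, `k=3`, and all parameter values are assumptions chosen for illustration. Placing the selector inside a `Pipeline` means it is re-fit on the training portion of each fold, which is the leakage-free arrangement the referee asks for.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for spectra: 150 samples, 30 "bands", 3 ripeness classes.
X, y = make_classification(
    n_samples=150, n_features=30, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Selection is a pipeline step, so cross_val_score fits it on each
# training fold only; the held-out fold never influences which bands are kept.
pipeline = Pipeline([
    ("select", SelectKBest(partial(mutual_info_classif, random_state=0), k=3)),
    ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
print(scores.mean(), scores.std())
```

The leaky variant would call the selector once on the full `X, y` before cross-validation and then evaluate the model on the pre-reduced data; comparing the two score distributions is the ablation described in the response above.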
Circularity Check
No circularity: purely empirical ML evaluation with cross-validation
The paper reports results from training and cross-validating 20 classical ML models on hyperspectral fruit data, with Bayesian hyperparameter tuning and preprocessing steps. Claims of tree-based outperformance versus Fruit-HSNet and 94% recovery with three wavelengths are experimental outcomes, not derivations that reduce to fitted parameters or self-citations by construction. No equations, ansatzes, or uniqueness theorems are invoked that equate outputs to inputs. The work is self-contained data-driven analysis; any comparability issues with prior work are external validity concerns, not circularity.
Axiom & Free-Parameter Ledger
free parameters (1)
- model hyperparameters
axioms (1)
- domain assumption: Hyperspectral reflectance measurements contain sufficient information to predict fruit ripeness and firmness