Curvature-aware 3D length estimation of greenhouse cucumbers using RGB-D imaging and cubic spline arc-length integration

Manveen Kaur; Rajmeet Singh; Saeed Mozaffri; Shahpour Alirezaee

arxiv: 2606.22439 · v1 · pith:MO3IKNHGnew · submitted 2026-06-21 · 💻 cs.CV · cs.RO

Pith reviewed 2026-06-26 10:58 UTC · model grok-4.3

keywords: cucumber length estimation, RGB-D imaging, medial axis spline, arc length integration, instance segmentation, greenhouse automation, cubic spline, YOLO SAM

The pith

Cubic spline fitted to the 3D medial axis estimates cucumber length with 4.13% MAPE.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a non-contact RGB-D system to measure greenhouse cucumber lengths at commercial scale, where manual thread measurements are accurate but too slow. It segments fruit with YOLO and SAM, then compares five 3D length methods on the same 48 captures from seven cucumbers. The medial arc spline approach, which fits a cubic spline to the medial axis points and integrates arc length by trapezoidal rule, records the lowest error and beats the other four methods at the corrected significance level. A side finding shows that depth-stream intrinsics after colour alignment produce 12-18% systematic underestimation.

Core claim

The novel medial arc spline method fits a cubic spline through the 3D medial axis of the SAM-refined mask and computes arc length by trapezoidal integration, delivering 4.13% MAPE on the benchmark and statistically outperforming the dominant-axis, PCA, medial-axis skeleton, and keypoint-guided baselines.
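In symbols, and using notation that is ours rather than the paper's, write the fitted spline as $\mathbf{C}(t)$ on $[t_0, t_N]$ and sample it at $K+1$ evenly spaced parameters $\tau_k$. The arc length and its trapezoidal approximation would then read roughly as follows:

```latex
L \;=\; \int_{t_0}^{t_N} \left\lVert \mathbf{C}'(t) \right\rVert \, dt
\;\approx\; \sum_{k=0}^{K-1} \frac{\Delta\tau}{2}
\left( \left\lVert \mathbf{C}'(\tau_k) \right\rVert
     + \left\lVert \mathbf{C}'(\tau_{k+1}) \right\rVert \right),
\qquad \Delta\tau = \frac{t_N - t_0}{K}, \quad \tau_k = t_0 + k\,\Delta\tau .
```

The sample count $K$ is a free choice in this sketch; the abstract does not say how finely the authors sample the curve.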

What carries the argument

Medial arc spline: cubic spline fitted to the 3D medial-axis points extracted from the instance mask, with length obtained by numerical integration of the resulting curve.
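A minimal sketch of this idea, under assumptions of our own: the medial-axis points are already ordered along the fruit, consecutive points are distinct, the curve is parameterised by cumulative chord length, and the spline uses natural end conditions. The paper does not state these choices, and the function names `natural_second_derivs`, `spline_derivative` and `spline_arc_length` are hypothetical.

```python
import math


def natural_second_derivs(t, y):
    """Second derivatives at the knots of the natural cubic spline through (t[i], y[i])."""
    n = len(t)
    m = [0.0] * n                      # natural ends: m[0] = m[n-1] = 0
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    for i in range(1, n - 1):          # forward sweep of the Thomas algorithm
        h0, h1 = t[i] - t[i - 1], t[i + 1] - t[i]
        rhs = 6.0 * ((y[i + 1] - y[i]) / h1 - (y[i] - y[i - 1]) / h0)
        denom = 2.0 * (h0 + h1) - h0 * c_prime[i - 1]
        c_prime[i] = h1 / denom
        d_prime[i] = (rhs - h0 * d_prime[i - 1]) / denom
    for i in range(n - 2, 0, -1):      # back substitution
        m[i] = d_prime[i] - c_prime[i] * m[i + 1]
    return m


def spline_derivative(t, y, m, i, x):
    """First derivative at x of the spline piece on [t[i], t[i+1]]."""
    h = t[i + 1] - t[i]
    return (-m[i] * (t[i + 1] - x) ** 2 / (2.0 * h)
            + m[i + 1] * (x - t[i]) ** 2 / (2.0 * h)
            + (y[i + 1] - y[i]) / h
            - (m[i + 1] - m[i]) * h / 6.0)


def spline_arc_length(points, samples_per_segment=50):
    """Trapezoidal arc length of a natural cubic spline through ordered 3D points."""
    t = [0.0]                          # cumulative chord length as the curve parameter
    for p, q in zip(points, points[1:]):
        t.append(t[-1] + math.dist(p, q))
    axes = list(zip(*points))          # x, y and z coordinate sequences
    second = [natural_second_derivs(t, a) for a in axes]
    length = 0.0
    for i in range(len(t) - 1):
        step = (t[i + 1] - t[i]) / samples_per_segment
        prev_speed = None
        for k in range(samples_per_segment + 1):
            x = t[i] + k * step
            speed = math.sqrt(sum(
                spline_derivative(t, a, m, i, x) ** 2 for a, m in zip(axes, second)))
            if prev_speed is not None:
                # Trapezoid rule on the speed |C'(t)| between neighbouring samples
                length += 0.5 * (prev_speed + speed) * step
            prev_speed = speed
    return length


# Hypothetical medial-axis points (cm) on a quarter circle of radius 10.
arc_points = [(10 * math.cos(a), 10 * math.sin(a), 0.0)
              for a in (k * math.pi / 16 for k in range(9))]
arc_length_cm = spline_arc_length(arc_points)
```

On the quarter-circle test points the result should sit close to the analytic length of 5π cm, whereas a straight end-to-end chord would miss the curvature entirely. That gap is the reason a curvature-aware method is expected to help on bent fruit.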

If this is right

Greenhouse operations can replace manual length checks for harvest scheduling, labour planning, and grading.
The pipeline achieves real-time performance with 100% coverage through adaptive method selection on a single consumer GPU.
Any RGB-D pipeline that aligns depth to the colour stream with rs.align must deproject with the colour-stream intrinsics; using the depth-stream intrinsics instead causes the 12-18% length underestimation.
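The mechanism behind the last point can be illustrated with the pinhole model alone. In the sketch below the two sets of intrinsics are made-up numbers chosen to land inside the reported 12-18% band; they are not D435 calibration values, and the sign and size of the error on real hardware depend on the actual stream profiles. The helper `deproject` is our own function, not a RealSense call.

```python
import math


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth_m metres to a 3D point."""
    return ((u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m)


# Hypothetical intrinsics (pixels). After alignment to colour, pixel
# coordinates live in the colour image, so COLOR is the matching set.
COLOR = dict(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
DEPTH = dict(fx=700.0, fy=700.0, cx=320.0, cy=240.0)

# Two hypothetical fruit endpoints in the aligned image, both 0.5 m away.
top_px, bottom_px, depth_m = (320, 100), (320, 380), 0.5


def endpoint_length(intrinsics):
    """3D distance between the two endpoints under the given intrinsics."""
    a = deproject(*top_px, depth_m, **intrinsics)
    b = deproject(*bottom_px, depth_m, **intrinsics)
    return math.dist(a, b)


length_matched = endpoint_length(COLOR)      # intrinsics match the pixel grid
length_mismatched = endpoint_length(DEPTH)   # wrong intrinsics for aligned pixels
underestimate_pct = 100.0 * (1.0 - length_mismatched / length_matched)
```

Because X and Y scale with 1/fx and 1/fy, a mismatch rescales every in-plane length by the ratio of focal lengths, which would explain why the error is systematic rather than noisy.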

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same spline integration on medial axes could be tested on other elongated curved produce such as zucchini or peppers.
Adding a larger and more diverse capture set would test whether the reported accuracy hierarchy holds outside the original seven fruits.
The length estimates could feed directly into robotic harvester control loops for automated picking decisions.

Load-bearing premise

Thread-based ground-truth lengths are accurate and the 48 captures from seven cucumbers in three size classes represent the shape variation and imaging conditions of commercial greenhouse production.

What would settle it

A new benchmark on at least 50 additional cucumbers under varied greenhouse lighting and camera distances that shows the medial arc spline no longer achieves the lowest MAPE or loses statistical significance against the other methods.

Figures

Figures reproduced from arXiv: 2606.22439 by Manveen Kaur, Rajmeet Singh, Saeed Mozaffri, Shahpour Alirezaee.

**Figure 1.** End-to-end pipeline: D435 burst capture → YOLO26n detection → SAM mask refinement → adaptive method selection (M1–M5) → annotated length output

**Figure 2.** Two-stage pipeline: YOLO26n backbone–neck–head with C3k2 blocks, FPN+PAN, and decou

**Figure 3.** Overview of M1–M5. M1: scan-line (no mask, fast). M2: PCA (orientation-free). M3: SAM +

**Figure 4.** M1 geometric limitation: vertical cucumber measured correctly (left); tilted cucumber underesti

**Figure 5.** M2: stride-2 sampling (Step 1), SVD principal axis with endpoints

**Figure 6.** M3: SAM prompting with box+centre-point (Step 1), mask generation highest-IoU mask selected

**Figure 7.** M4: YOLO26n detection (Step 1), YOLO26-pose keypoints KP0–KP4 (Step 2), visibility and

**Figure 8.** M5 nine-step flowchart: YOLO26 detection, SAM segmentation, 3D cloud, SVD axis, cross-section

**Figure 9.** Adaptive method-selection flowchart with cascading fallbacks from M5 (best) to M1 (fastest).

**Figure 10.** Dataset samples: (a) laboratory, (b) greenhouse, (c) segmentation annotations, (d) keypoint

**Figure 11.** Thread-based ground truth measurement protocol. Left: thread laid along dorsal midline and

**Figure 12.** YOLO26n training/validation over 300 epochs. Top: training losses and detection metrics.

**Figure 13.** Qualitative comparison: detection + SAM mask, depth map, measurement overlay, and 3D point

**Figure 14.** (a) Predicted vs. ground truth. (b) Per-capture tracking. (c) Error box-and-strip plots. Median:

**Figure 15.** Five-method comparison on new 5-cucumber validation set (GT 14.72–38.0 cm). Each row shows

**Figure 16.** CucumberVision dashboard. (a) Capture interface with 30 detected cucumbers and per-fruit

Original abstract

Commercial greenhouse cucumber production is graded by fruit length, which drives harvest scheduling, labour allocation, and logistics. Manual measurement with thread or caliper is accurate but infeasible at commercial scale. This paper presents CucumberVision, a non-contact length estimation framework using an Intel RealSense D435 RGB-D camera. A YOLO26n instance segmentation model locates cucumbers, and SAM (ViT-B backbone) refines each detection to a pixel-precise mask. Five methods are evaluated under matched conditions: (M1) a dominant-axis skeleton scan-line baseline; (M2) PCA on the bounding-box depth point cloud; (M3) SAM mask with medial-axis skeletonisation; (M4) a hybrid keypoint-guided approach using a YOLO26-pose model predicting five anatomical landmarks (KP0--KP4) with piecewise 3D arc-length; and (M5) a novel medial arc spline method fitting a cubic spline through the 3D medial axis of the SAM mask and computing arc length by trapezoidal integration -- the first such application to elongated vegetable measurement. All methods share five-frame burst depth averaging, colour-stream intrinsic alignment, and adaptive method selection with cascading fallbacks ensuring 100% coverage. A benchmark of 48 captures across seven cucumbers in three size categories (small ~8 cm, medium ~13 cm, large ~25 cm) with thread-based ground truth establishes a significant accuracy hierarchy: M1 (MAPE 9.68%) > M2 (5.31%) > M4 (5.51%) > M3 (5.82%) > M5 (4.13%). M5 significantly outperforms all competitors at Bonferroni-corrected alpha=0.0125. A secondary contribution is identifying a 12--18% length underestimation caused by using depth-stream rather than colour-stream intrinsics after rs.align(rs.stream.color) -- an under-reported error source. The complete system is released open source and runs in real time on a single consumer-grade GPU.
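The cascading fallback described in the abstract can be pictured as a loop over methods in priority order. The sketch below is only that picture: the two estimators are stand-ins operating on a dictionary, the rule that triggers a fallback (too few medial-axis points) is our assumption, and only the M5-first, M1-last ordering comes from the source.

```python
def estimate_with_fallback(capture, methods):
    """Try each (name, estimator) in order and return the first usable estimate."""
    for name, estimator in methods:
        length_cm = estimator(capture)
        if length_cm is not None:
            return name, length_cm
    raise RuntimeError("no method produced an estimate")


# Hypothetical stand-ins: the real M5 and M1 need masks and depth data.
def m5_medial_arc_spline(capture):
    # Assumed failure rule: too few medial-axis points to fit a spline.
    if len(capture["medial_axis"]) < 4:
        return None
    return capture["spline_length_cm"]


def m1_scan_line(capture):
    # Assumed to always return a value, which is what closes the cascade.
    return capture["scan_line_length_cm"]


cascade = [("M5", m5_medial_arc_spline), ("M1", m1_scan_line)]

# Two invented captures: one with a degenerate medial axis, one with a usable one.
sparse = {"medial_axis": [(0, 0, 0)], "spline_length_cm": None,
          "scan_line_length_cm": 12.4}
dense = {"medial_axis": [(0, 0, k) for k in range(10)], "spline_length_cm": 13.1,
         "scan_line_length_cm": 12.4}
```

Full coverage follows only if the last method in the list never declines, which is the role the abstract appears to give the M1 baseline.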

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The medial-arc spline method beats their baselines on 48 captures from 7 cucumbers, but the significance test ignores repeated measures and the sample is too small for strong claims.

The main thing here is that their M5 approach—fitting a cubic spline to the 3D medial axis extracted from SAM masks and integrating arc length—gets the lowest error (4.13% MAPE) among the five methods they compare. The rest of the pipeline is standard YOLO plus RealSense stuff.

What is actually new is the specific combination of cubic-spline arc-length on the medial axis for this elongated-vegetable task; nothing in the cited work does exactly that. They also run a clean head-to-head on the same 48 captures, flag the 12-18% intrinsics error after rs.align, and ship the full code. That last part is useful.

The soft spot is the evaluation. Seven cucumbers in total, even with three size classes and multiple views, is a narrow base for claiming a clear accuracy hierarchy. The independence objection lands: the Bonferroni-corrected tests assume independent observations, but repeated captures of the same fruit share geometry and sensor placement, so the p-values are likely anti-conservative. No mixed-effects or clustering correction is mentioned. The thread ground truth is fine, but the benchmark does not yet show how the method holds up across more produce or real greenhouse variation.

This is for people working on practical RGB-D measurement in agriculture or similar niche CV tasks. A reader who needs a working pipeline or wants to see how spline integration compares to skeleton or keypoint methods will get concrete value from the comparisons and the released code. It has enough empirical grounding and reproducibility to deserve a serious referee rather than a desk reject.

I would send it out for review, but the referees will almost certainly ask for more data and a proper handling of the repeated measures.

Referee Report

1 major / 1 minor

Summary. The paper introduces CucumberVision, a non-contact RGB-D framework for estimating greenhouse cucumber lengths. It combines YOLO26n instance segmentation with SAM mask refinement, then evaluates five 3D length methods on matched data: a skeleton scan-line baseline (M1), PCA on depth point clouds (M2), medial-axis skeletonisation (M3), keypoint-guided piecewise arcs (M4), and a novel cubic-spline fit to the 3D medial axis with trapezoidal arc-length integration (M5). All methods use five-frame depth averaging and adaptive fallbacks. On a benchmark of 48 captures from seven cucumbers (three size classes) with thread ground truth, the paper reports MAPE values establishing the hierarchy M5 (4.13%) best, followed by M2, M4, M3, M1, with M5 significantly outperforming the others at Bonferroni-corrected α=0.0125. A secondary finding is 12–18% underestimation when using depth rather than colour intrinsics after alignment. The full system is released open-source and runs in real time.
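The five-frame depth averaging mentioned in the summary could look like the following sketch. It assumes depth frames arrive as equally sized 2D grids in millimetres and that a reading of 0 marks a pixel with no depth, which is the usual RealSense convention; whether the authors skip invalid pixels or take a plain mean is not stated in the abstract, and `average_burst` is a hypothetical name.

```python
def average_burst(frames):
    """Per-pixel mean over a burst of depth frames, ignoring zero (invalid) readings."""
    rows, cols = len(frames[0]), len(frames[0][0])
    averaged = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            valid = [frame[r][c] for frame in frames if frame[r][c] > 0]
            # Leave the pixel at 0.0 when no frame in the burst saw it.
            averaged[r][c] = sum(valid) / len(valid) if valid else 0.0
    return averaged


# Invented 1x2 burst: the first pixel drops out once, the second never returns depth.
burst = [[[500, 0]], [[502, 0]], [[0, 0]], [[498, 0]], [[500, 0]]]
averaged_depth = average_burst(burst)
```

Skipping zeros matters here because a single dropout averaged in as 0 mm would pull the pixel's depth, and hence the deprojected length, well below its true value.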

Significance. If the reported accuracy ordering holds after statistical correction, the work offers a practical, scalable alternative to manual thread or caliper measurement for commercial cucumber grading. The open-source release, real-time performance on consumer GPUs, and explicit identification of the intrinsics mismatch constitute clear strengths that increase the manuscript’s utility to the RGB-D and agricultural-vision communities.

major comments (1)

[Results section (statistical comparison of the five methods)] The claim that M5 significantly outperforms all competitors at Bonferroni-corrected α=0.0125 is based on treating the 48 captures as independent observations. With only seven cucumbers and multiple captures per fruit, the data constitute repeated measures; standard pairwise or ANOVA tests underlying the Bonferroni adjustment assume independence. Correlated errors within each cucumber (shared geometry and sensor pose) reduce effective degrees of freedom and can produce anti-conservative p-values, directly undermining the reported significance hierarchy.

minor comments (1)

[Abstract] The listed accuracy hierarchy “M1 (MAPE 9.68%) > M2 (5.31%) > M4 (5.51%) > M3 (5.82%) > M5 (4.13%)” is not monotonic in the MAPE values: M2 (5.31%) has a lower error than M4 (5.51%), and M4 a lower error than M3 (5.82%), so the chain is consistent neither as decreasing error nor as a method ranking. Clarify whether the symbols denote error magnitude or method ranking, and order the methods accordingly.
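Sorting the five MAPE values quoted in the abstract makes the intended ordering explicit. The snippet below only rearranges those published numbers; the variable names are ours.

```python
# MAPE values (percent) as quoted in the abstract.
mape_by_method = {"M1": 9.68, "M2": 5.31, "M3": 5.82, "M4": 5.51, "M5": 4.13}

# Lowest error first, i.e. best method first.
ranking = sorted(mape_by_method, key=mape_by_method.get)
chain = " < ".join(f"{m} ({mape_by_method[m]:.2f}%)" for m in ranking)
```

Read as increasing error, the chain is M5 < M2 < M4 < M3 < M1, which matches the order given in the referee's summary rather than the one printed in the abstract.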

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the statistical comparison. We agree that the repeated-measures structure (multiple captures per cucumber) violates the independence assumption of the original tests and will revise the analysis accordingly.

Referee: Results section (statistical comparison of the five methods): The claim that M5 significantly outperforms all competitors at Bonferroni-corrected α=0.0125 is based on treating the 48 captures as independent observations. With only seven cucumbers and multiple captures per fruit, the data constitute repeated measures; standard pairwise or ANOVA tests underlying the Bonferroni adjustment assume independence. Correlated errors within each cucumber (shared geometry and sensor pose) reduce effective degrees of freedom and can produce anti-conservative p-values, directly undermining the reported significance hierarchy.

Authors: We fully agree that the 48 captures are repeated measures on only seven cucumbers and that the original pairwise tests (with Bonferroni correction) assume independence, which is not met. This is a valid concern that can inflate significance. We will revise the manuscript by (i) computing per-cucumber mean errors, (ii) applying a linear mixed-effects model with cucumber identity as a random effect and method as a fixed effect, and (iii) reporting the resulting p-values and effect sizes. The revised results section will qualify or remove the original significance claim if it does not hold under the mixed model. The open-source code will be updated to include the new analysis script.

Revision: yes
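Step (i) of the response, together with a cluster-level test, can be sketched in a few lines. The sketch below is not the mixed-effects model the authors promise; it swaps in an exact sign-flip permutation test on per-cucumber mean differences, and every number in it is invented for illustration. One thing it makes visible: with seven fruits there are only 2^7 = 128 sign patterns, so the smallest two-sided p-value this particular test can return is 2/128 ≈ 0.0156, which is above the 0.0125 threshold.

```python
from collections import defaultdict
from itertools import product


def per_cucumber_mean_diff(records):
    """Average (err_a - err_b) within each cucumber; records are (id, err_a, err_b)."""
    totals = defaultdict(lambda: [0.0, 0])
    for cucumber_id, err_a, err_b in records:
        totals[cucumber_id][0] += err_a - err_b
        totals[cucumber_id][1] += 1
    return {cid: total / count for cid, (total, count) in totals.items()}


def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip test: share of sign patterns at least as extreme."""
    observed = abs(sum(diffs))
    extreme = 0
    patterns = 0
    for signs in product((1, -1), repeat=len(diffs)):
        patterns += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / patterns


# Invented absolute percentage errors (method A, method B), two captures per fruit.
records = [
    ("c1", 4.0, 6.0), ("c1", 5.0, 6.0),
    ("c2", 3.0, 3.5), ("c2", 3.5, 4.0),
    ("c3", 4.0, 7.0), ("c3", 4.5, 6.5),
    ("c4", 5.0, 6.0), ("c4", 4.0, 5.0),
    ("c5", 3.0, 5.0), ("c5", 3.0, 5.0),
    ("c6", 4.5, 5.0), ("c6", 4.0, 5.5),
    ("c7", 2.0, 5.0), ("c7", 3.0, 6.0),
]
cluster_diffs = per_cucumber_mean_diff(records)
p_value = sign_flip_p_value(list(cluster_diffs.values()))
```

In this toy data method A wins on every fruit, the most favourable case possible, and the test still cannot clear 0.0125. A mixed-effects model uses within-fruit information and may behave differently, so this is a caution about cluster count rather than a prediction of the revised result.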

Circularity Check

0 steps flagged

No circularity: empirical comparison to external thread ground truth

The paper describes five length-estimation pipelines (M1–M5) and reports their MAPE on 48 RGB-D captures against independent thread measurements. No derivation, formula, or 'prediction' is presented whose output is algebraically identical to its inputs by construction. M5 is defined as cubic-spline arc-length integration on the medial axis; this is a standard numerical procedure, not a self-referential fit. The accuracy hierarchy is obtained by direct measurement against external ground truth, not by renaming or re-fitting quantities already present in the model equations. No self-citations are invoked as load-bearing uniqueness theorems. The statistical-independence concern raised by the skeptic is a question of experimental design validity, not a circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard camera models, pre-trained segmentation networks, and the assumption that the medial axis extracted from the SAM mask faithfully represents the cucumber's central curve; no new free parameters are introduced in the arc-length integration itself.

axioms (2)

domain assumption: The 3D medial axis computed from the SAM mask accurately traces the central curve of the cucumber.
  Load-bearing premise for the M5 spline fitting step.
domain assumption: Thread-based manual measurements constitute error-free ground truth.
  Required for all MAPE calculations and significance tests.
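Since the second axiom feeds every MAPE figure, it may help to spell the metric out. The sketch below uses the standard definition of mean absolute percentage error; the predicted and thread-measured lengths are invented values near the three size classes, not data from the paper.

```python
def mape(predicted, ground_truth):
    """Mean absolute percentage error, in percent, relative to the ground truth."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - g) / g for p, g in zip(predicted, ground_truth)]
    return 100.0 * sum(ratios) / len(ratios)


# Invented lengths in cm: thread measurement versus a camera estimate.
thread_cm = [8.0, 13.0, 25.0]
estimated_cm = [8.4, 12.35, 26.0]
example_mape = mape(estimated_cm, thread_cm)
```

Because the thread length sits in the denominator, any bias in the manual measurement shifts every method's score together, which is why the ledger lists it as an assumption rather than a tested quantity.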

