ReLeaf: Benchmarking Leaf Segmentation across Domains and Species

Robert Martinko , Daniel Steininger , Julia Simon , Andreas Trondl , Matthias Blaickner


Pith reviewed 2026-05-07 17:58 UTC · model grok-4.3

classification 💻 cs.CV

keywords datasets, species, across, data, plant, benchmark, especially, four


The pith

A YOLO26 model trained on four leaf segmentation datasets reaches 83.9% mean mAP50-95 on their test sets but only 40.2% on a new 23-species benchmark, revealing substantial cross-domain generalization gaps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Leaf segmentation means using AI to outline individual leaves in photos of plants. This is useful because it lets farmers or machines check each leaf for health, growth, or problems like disease or stress. The paper examines four existing collections of labeled leaf photos covering different plants. The authors test several AI models designed for finding and outlining objects in images, including YOLO versions, two-stage detectors, and transformer-based models. One YOLO26 configuration gave the best balance of accuracy and speed for practical use. The authors also test how these models perform when the plants or the recording setup differ from what the model saw during training. Performance often drops sharply, especially for models trained only on lab photos. To improve data availability, the authors created a new collection of photos with leaf-level outlines for 23 plant species, labeled semi-automatically from existing CropAndWeed images. A model trained on all four older datasets gives strong results on their test photos but much weaker results on the new collection. This points to the value of training data that covers many species and conditions when building reliable tools for farming.
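The scores quoted on this page are mAP50-95 values. In the COCO convention, that metric averages precision over ten mask-overlap (IoU) thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below is not the paper's evaluation code; it only illustrates how one predicted leaf mask is compared with a ground-truth mask and at which thresholds it would count as a match. The function names and the toy masks are hypothetical.

```python
# Illustrative only: masks are represented as sets of (row, col) pixel coordinates.

def iou_thresholds():
    """The ten IoU thresholds behind mAP50-95: 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]

def mask_iou(pred, truth):
    """Intersection over union of two pixel sets; 0.0 if both are empty."""
    union = pred | truth
    if not union:
        return 0.0
    return len(pred & truth) / len(union)

def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in iou_thresholds() if iou >= t]

# Toy example (hypothetical data): a 4x5 "leaf" and a prediction shifted by one column.
truth = {(r, c) for r in range(4) for c in range(0, 5)}
pred = {(r, c) for r in range(4) for c in range(1, 6)}

iou = mask_iou(pred, truth)
print(f"IoU = {iou:.3f}, matched at thresholds {thresholds_passed(iou)}")
```

A mask that is only slightly misaligned still matches at the loose thresholds but fails the strict ones, which is why mAP50-95 penalizes imprecise leaf outlines more heavily than mAP50 alone.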

Core claim

A model trained on all four existing datasets achieves a mean mAP50-95 of 83.9% across their corresponding test sets and 40.2% on our new benchmark, demonstrating improved generalization and highlighting the need for diverse leaf-segmentation datasets in robust precision agriculture.
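To make the size of the gap concrete, the short calculation below uses only the two figures reported in the claim (83.9% and 40.2%). The helper name `generalization_gap` is hypothetical, and the relative drop is a derived quantity that the abstract does not state.

```python
# Reported values from the abstract (mAP50-95, in percent).
IN_DOMAIN = 83.9      # mean over the four existing test sets
NEW_BENCHMARK = 40.2  # new 23-species benchmark

def generalization_gap(source_score, target_score):
    """Return (absolute gap in percentage points, relative drop in percent)."""
    absolute = round(source_score - target_score, 1)
    relative = round(100 * (source_score - target_score) / source_score, 1)
    return absolute, relative

absolute, relative = generalization_gap(IN_DOMAIN, NEW_BENCHMARK)
print(f"absolute gap: {absolute} points, relative drop: {relative}%")
```

Under these numbers the model loses 43.7 percentage points, a little over half of its in-domain score, when moved to the new benchmark.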

Load-bearing premise

The semi-automatic annotation process for the new 23-species dataset produces sufficiently accurate ground-truth masks, and the four selected public datasets adequately represent the range of species and imaging conditions encountered in real precision agriculture.

Figures

Figures reproduced from arXiv: 2605.03784 by Andreas Trondl, Daniel Steininger, Julia Simon, Matthias Blaickner, Robert Martinko.

**Figure 1.** Representative samples from …

**Figure 2.** Representative processing pipeline, illustrating the typical context of leaf segmentation. Image patches depicting individual …

**Figure 3.** Representative images from leaf-segmentation …

**Figure 5.** Qualitative leaf-segmentation results on representative …

**Figure 6.** Representative leaf-segmentation results of YOLO26 …

**Figure 7.** Comparison of bounding-box accuracy (% mAP …

**Figure 8.** Representative leaf-segmentation results of selected …

**Figure 9.** Representative leaf-segmentation results of YOLO26 models ( …

**Figure 10.** Representative leaf-segmentation results of YOLO26 models ( …

Original abstract

Rising global food demand and growing climate pressure increase the need for sustainable, precise agricultural practices. Automated, individualized plant treatment relies on fine-grained visual analysis, yet leaf-level segmentation remains underexplored despite its value for assessing crop health, growth dynamics, yield potential and localized stress symptoms. Progress is limited by a lack of dedicated datasets, especially regarding species coverage, and by the absence of systematic evaluations of modern instance-segmentation architectures for this task. We address these gaps by surveying current data and identifying four suitable, publicly available leaf-segmentation datasets. Using them, we compare one-stage, two-stage and Transformer-based detectors and identify a YOLO26 model configuration to provide the best trade-off for real-world precision-agriculture tasks. Extensive cross-domain generalization experiments reveal substantial performance drops across plant species and recording setups, especially for models trained solely on laboratory data. To strengthen data availability, we introduce a new benchmark dataset with leaf-level masks for 23 plant species, created via semi-automatic annotation of selected CropAndWeed images. A model trained on all four existing datasets achieves a mean mAP50-95 of 83.9% across their corresponding test sets and 40.2% on our new benchmark, demonstrating improved generalization and highlighting the need for diverse leaf-segmentation datasets in robust precision agriculture.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical benchmarking study in computer vision. No mathematical axioms, free parameters fitted to the central claim, or newly invented entities are introduced; model selection and hyperparameters follow standard practices for the cited architectures.


