Vines-DB: An RGB image dataset for multi-species ornamental vine segmentation

Aaron Etienne; Saroj Burlakoti; Shital Poudyal (Utah State University); Utsav Bhandari

arxiv: 2606.18484 · v1 · pith:UAE4TPJOnew · submitted 2026-06-16 · 💻 cs.CV

Vines-DB: An RGB image dataset for multi-species ornamental vine segmentation

Saroj Burlakoti , Utsav Bhandari , Aaron Etienne , Shital Poudyal (Utah State University) This is my paper

Pith reviewed 2026-06-27 00:49 UTC · model grok-4.3

classification 💻 cs.CV

keywords Vines-DBornamental vinesinstance segmentationRGB datasetdeep learninghorticulturespecies identificationcanopy estimation

0 comments

The pith

Vines-DB supplies 1,218 field RGB images of seven vine species with polygon masks for instance segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Vines-DB, a collection of 1,218 high-resolution RGB images of seven ornamental vine species captured repeatedly across growing seasons from plants on trellises. Images were taken under controlled daylight conditions against uniform backdrops and then manually annotated with polygon masks for each species plus a background class. The set expands to 2,307 images after augmentation and is split into training, validation, and test portions. The resource targets deep learning models for multi-class instance segmentation tasks in precision horticulture and urban ecology. A sympathetic reader would care because such a dataset could support automated tools for canopy cover estimation, species identification, and phenotyping where field monitoring is otherwise labor intensive.

Core claim

Vines-DB contains 1,218 original images from 168 individual plants of seven species photographed monthly in 2023 and 2024, annotated with eight-class polygon instance segmentation masks, then expanded to 2,307 images to support model development and evaluation for multi-class segmentation under realistic field conditions.

What carries the argument

The polygon-based instance segmentation masks for the seven vine species plus background, produced by manual annotation in Roboflow on images taken at 1 m distance against Styrofoam backdrops.

Load-bearing premise

The manual polygon annotations produced by trained annotators are sufficiently accurate and consistent to serve as reliable ground truth for model training and benchmarking.

What would settle it

A direct comparison of segmentation model accuracy when trained on the released annotations versus on a fresh set of images re-annotated by a second independent group of trained annotators.

Figures

Figures reproduced from arXiv: 2606.18484 by Aaron Etienne, Saroj Burlakoti, Shital Poudyal (Utah State University), Utsav Bhandari.

**Figure 1.** Figure 1: Sample images from the Vine-DB dataset. The left panel shows raw RGB field images, while the right panel shows corresponding expert-annotated instance segmentation masks overlaid on the original images. EXPERIMENTAL DESIGN, MATERIALS AND METHODS Study site and experimental setup Data were collected at the Utah Agricultural Experiment Station, Greenville Research Farm, Logan, Utah, USA. The study was conduc… view at source ↗

read the original abstract

The Vines-DB dataset contains 1,218 original high-resolution RGB images of seven ornamental vine species collected under field conditions at the Utah Agricultural Experiment Station's Greenville Research Farm in Logan, Utah, USA. The dataset was generated from 168 individual vine plants that were transplanted in 2022 and photographed repeatedly across multiple months during the 2023 and 2024 growing seasons (July-October). Images were captured with an iPhone 16 Pro equipped with a 48 MP camera between 10:00 AM and 12:00 PM under daylight. Vines were grown on 1.2m x 2.4m trellises and photographed from a distance of 1m against black or white Styrofoam backdrops to improve contrast and reduce background noise. The dataset includes Akebia quinata, Campsis radicans, Hydrangea anomala petiolaris, Lonicera x heckrottii, Campsis x tagliabuana 'Madame Galen', Parthenocissus quinquefolia, and Wisteria floribunda. All original images were manually annotated in Roboflow by trained annotators to produce polygon-based instance segmentation masks for eight classes, including seven species and background. After preprocessing and data augmentation, the working dataset was expanded to 2,307 images for model development and evaluation. The augmented dataset was divided into 2,019 training images, 192 validation images, and 96 test images using stratified sampling to maintain balanced representation. Vines-DB supports the development and evaluation of deep learning models for multi-class instance segmentation in precision horticulture and urban ecology. The dataset enables applications such as automated canopy cover estimation, species identification, and scalable field phenotyping. In addition, repeated monthly imaging of the plants captures temporal variation in canopy development and plant appearance, increasing the dataset's utility for segmentation benchmarking under realistic field conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Vines-DB is a new dataset of 1,218 field images for seven ornamental vines with clear collection and split details, but the polygon annotations lack any reported quality checks.

read the letter

The main thing to know is that this paper releases an original collection of 1,218 RGB images of seven specific ornamental vine species, taken repeatedly from the same plants over two growing seasons at a Utah research farm, with polygon masks for eight-class instance segmentation.

The paper does a solid job on the practical side. It spells out the species list, the iPhone capture protocol with backdrops, the 168 plants involved, the monthly timing across 2023-2024, the augmentation steps that expand it to 2,307 images, and the stratified train-val-test split. The temporal coverage is a genuine addition for anyone wanting to test models on changing canopy appearance.

The soft spot is the annotations. They were done manually in Roboflow by trained annotators, but the paper gives no inter-annotator agreement scores, no boundary precision checks, and no expert validation. For a segmentation dataset meant to support model benchmarking, that leaves the ground truth reliability unproven and could affect any downstream IoU or mAP numbers.

This is for computer vision researchers working on precision horticulture or urban plant applications who need data on these exact vines. It will not move the broader field but could serve as a practical resource for niche training or evaluation if the images and masks are released publicly.

I would send it to peer review to confirm data availability and to see whether they can add basic annotation quality metrics.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Vines-DB, a dataset of 1,218 original high-resolution RGB images of seven ornamental vine species (Akebia quinata, Campsis radicans, Hydrangea anomala petiolaris, Lonicera x heckrottii, Campsis x tagliabuana 'Madame Galen', Parthenocissus quinquefolia, and Wisteria floribunda) collected under field conditions at a Utah research farm. Images were captured with an iPhone 16 Pro against backdrops, manually annotated in Roboflow with polygon masks for eight classes (seven species plus background), augmented to 2,307 images, and stratified into 2,019/192/96 train/val/test splits. The work positions the dataset as a resource for training and evaluating deep learning models for multi-class instance segmentation in precision horticulture and urban ecology, including canopy estimation, species ID, and phenotyping, with temporal variation from repeated seasonal imaging.

Significance. If the polygon annotations are shown to be reliable, Vines-DB would represent a useful addition to the limited set of public datasets for multi-species plant segmentation under realistic but controlled field conditions. The concrete details on collection protocol, species diversity, temporal sampling across 2023-2024, and stratified split strengthen its potential utility for model benchmarking in computer vision applications to horticulture. The absence of quantitative annotation validation metrics, however, limits the strength of claims about enabling reliable model development and evaluation.

major comments (2)

[Abstract] Abstract (annotation paragraph): The statement that 'All original images were manually annotated in Roboflow by trained annotators to produce polygon-based instance segmentation masks for eight classes' provides no quantitative annotation quality metrics (e.g., inter-annotator agreement via mean IoU or Dice, boundary precision, or expert review). This directly undermines the central claim that the dataset supports reliable training and benchmarking of segmentation models, because test-set mAP/IoU could reflect annotation artifacts rather than model performance.
[Dataset description] Dataset description (methods-equivalent section): While the image acquisition protocol (camera model, time of day, distance, trellis size, backdrops) is specified in concrete detail, the annotation workflow is described only at the level of 'trained annotators' with no further information on annotator count, training procedure, quality-control steps, or consistency checks. This is load-bearing for the weakest assumption that the Roboflow masks constitute trustworthy ground truth.

minor comments (2)

[Abstract] Abstract: The augmentation and preprocessing steps are mentioned but not enumerated (e.g., which geometric or photometric transforms were applied); adding a brief list would improve reproducibility.
[Abstract] Species list: Scientific names are given without consistent italicization or authority citations; minor formatting consistency would aid readability.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback and the recommendation for minor revision. The comments correctly identify areas where the annotation process description can be strengthened. We respond point by point below.

read point-by-point responses

Referee: [Abstract] Abstract (annotation paragraph): The statement that 'All original images were manually annotated in Roboflow by trained annotators to produce polygon-based instance segmentation masks for eight classes' provides no quantitative annotation quality metrics (e.g., inter-annotator agreement via mean IoU or Dice, boundary precision, or expert review). This directly undermines the central claim that the dataset supports reliable training and benchmarking of segmentation models, because test-set mAP/IoU could reflect annotation artifacts rather than model performance.

Authors: We agree that the lack of quantitative annotation quality metrics is a limitation. Inter-annotator agreement and similar metrics were not computed during dataset creation. We will revise the abstract to provide a clearer description of the annotation process and to explicitly note the absence of formal quantitative validation metrics as a limitation, so that users can interpret benchmark results accordingly. revision: partial
Referee: [Dataset description] Dataset description (methods-equivalent section): While the image acquisition protocol (camera model, time of day, distance, trellis size, backdrops) is specified in concrete detail, the annotation workflow is described only at the level of 'trained annotators' with no further information on annotator count, training procedure, quality-control steps, or consistency checks. This is load-bearing for the weakest assumption that the Roboflow masks constitute trustworthy ground truth.

Authors: We will expand the dataset description section to include additional details on the annotation workflow, specifically the number of annotators, the training procedure, and the quality-control steps used when creating the polygon masks in Roboflow. revision: yes

standing simulated objections not resolved

Quantitative inter-annotator agreement metrics (e.g., mean IoU) are unavailable because they were not computed as part of the original annotation process.

Circularity Check

0 steps flagged

Dataset release paper contains no derivations, fits, or predictions

full rationale

The manuscript is a straightforward dataset paper describing image collection, annotation in Roboflow, and augmentation/split. No equations, parameter fitting, model predictions, or self-citation chains appear in the provided text. The central claim (utility for DL segmentation) rests on the external validity of the released data rather than any internal reduction to inputs. This matches the default expectation of no circularity for non-derivational work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the quality of human annotations and the representativeness of the collected images; no free parameters, invented entities, or non-standard mathematical axioms are introduced.

axioms (1)

domain assumption Manual polygon annotations by trained annotators produce accurate instance segmentation masks suitable for model training.
Invoked when stating that all original images were annotated to produce the masks used for model development.

pith-pipeline@v0.9.1-grok · 5898 in / 1315 out tokens · 37460 ms · 2026-06-27T00:49:06.751950+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

9 extracted references · 7 canonical work pages

[1]

Fuentes, C

S. Fuentes, C. Poblete-Echeverría, S. Ortega-Farias, S. Tyerman, R. Bei, Automated estimation of Leaf Area Index from grapevine canopies using cover photography, video and computational analysis methods, Australian Journal of Grape and Wine Research (2014). https://doi.org/10.1111/ajgw.12098

work page doi:10.1111/ajgw.12098 2014
[2]

Xiong, C

Y . Xiong, C. p. West, C. p. Brown, P . e. Green, Digital Image Analysis of Old World Bluestem Cover to Estimate Canopy Development, Agronomy Journal 111 (2019) 1247–1253. https://doi.org/10.2134/agronj2018.08.0502

work page doi:10.2134/agronj2018.08.0502 2019
[3]

Seidel, S

D. Seidel, S. Fleck, C. Leuschner, T. Hammett, Review of ground-based methods to measure the distribution of biomass in forest canopies, Annals of Forest Science 68 (2011) 225–244. https://doi.org/10.1007/s13595-011-0040-z

work page doi:10.1007/s13595-011-0040-z 2011
[4]

Bhandari, A

U. Bhandari, A. Etienne, Precision weed detection using UAVs and deep learning: Models, paradigms, and challenges, Smart Agricultural Technology 13 (2026) 101656. https://doi.org/10.1016/j.atech.2025.101656

work page doi:10.1016/j.atech.2025.101656 2026
[5]

Burlakoti, Y

S. Burlakoti, Y . Sun, S. Poudyal, Vines in the Landscape: Virginia Creeper, All Current Publications (2026) 1–5

2026
[6]

Burlakoti, Y

S. Burlakoti, Y . Sun, S. Poudyal, Vines in the Landscape: Goldflame Honeysuckle, All Current Publications (2026) 1–5

2026
[7]

Khoshgoftaar , title =

C. Shorten, T.M. Khoshgoftaar, A survey on Image Data Augmentation for Deep Learning, J Big Data 6 (2019) 60. https://doi.org/10.1186/s40537-019-0197-0

work page doi:10.1186/s40537-019-0197-0 2019
[8]

Frontiers in Astronomy and Space Sciences , keywords =

H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, S. Savarese, Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression, in: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: pp. 658–666. https://doi.org/10.1109/CVPR.2019.00075

work page doi:10.1109/cvpr.2019.00075 2019
[9]

Straßburger (2013): Cut Elimination in Nested Sequents for Intuitionistic Modal Logics

K. Sechidis, G. Tsoumakas, I. Vlahavas, On the Stratification of Multi-label Data, in: D. Gunopulos, T. Hofmann, D. Malerba, M. Vazirgiannis (Eds.), Machine Learning and Knowledge Discovery in Databases, Springer, Berlin, Heidelberg, 2011: pp. 145–158. https://doi.org/10.1007/978-3-642- 23808-6_10

work page doi:10.1007/978-3-642- 2011

[1] [1]

Fuentes, C

S. Fuentes, C. Poblete-Echeverría, S. Ortega-Farias, S. Tyerman, R. Bei, Automated estimation of Leaf Area Index from grapevine canopies using cover photography, video and computational analysis methods, Australian Journal of Grape and Wine Research (2014). https://doi.org/10.1111/ajgw.12098

work page doi:10.1111/ajgw.12098 2014

[2] [2]

Xiong, C

Y . Xiong, C. p. West, C. p. Brown, P . e. Green, Digital Image Analysis of Old World Bluestem Cover to Estimate Canopy Development, Agronomy Journal 111 (2019) 1247–1253. https://doi.org/10.2134/agronj2018.08.0502

work page doi:10.2134/agronj2018.08.0502 2019

[3] [3]

Seidel, S

D. Seidel, S. Fleck, C. Leuschner, T. Hammett, Review of ground-based methods to measure the distribution of biomass in forest canopies, Annals of Forest Science 68 (2011) 225–244. https://doi.org/10.1007/s13595-011-0040-z

work page doi:10.1007/s13595-011-0040-z 2011

[4] [4]

Bhandari, A

U. Bhandari, A. Etienne, Precision weed detection using UAVs and deep learning: Models, paradigms, and challenges, Smart Agricultural Technology 13 (2026) 101656. https://doi.org/10.1016/j.atech.2025.101656

work page doi:10.1016/j.atech.2025.101656 2026

[5] [5]

Burlakoti, Y

S. Burlakoti, Y . Sun, S. Poudyal, Vines in the Landscape: Virginia Creeper, All Current Publications (2026) 1–5

2026

[6] [6]

Burlakoti, Y

S. Burlakoti, Y . Sun, S. Poudyal, Vines in the Landscape: Goldflame Honeysuckle, All Current Publications (2026) 1–5

2026

[7] [7]

Khoshgoftaar , title =

C. Shorten, T.M. Khoshgoftaar, A survey on Image Data Augmentation for Deep Learning, J Big Data 6 (2019) 60. https://doi.org/10.1186/s40537-019-0197-0

work page doi:10.1186/s40537-019-0197-0 2019

[8] [8]

Frontiers in Astronomy and Space Sciences , keywords =

H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, S. Savarese, Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression, in: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: pp. 658–666. https://doi.org/10.1109/CVPR.2019.00075

work page doi:10.1109/cvpr.2019.00075 2019

[9] [9]

Straßburger (2013): Cut Elimination in Nested Sequents for Intuitionistic Modal Logics

K. Sechidis, G. Tsoumakas, I. Vlahavas, On the Stratification of Multi-label Data, in: D. Gunopulos, T. Hofmann, D. Malerba, M. Vazirgiannis (Eds.), Machine Learning and Knowledge Discovery in Databases, Springer, Berlin, Heidelberg, 2011: pp. 145–158. https://doi.org/10.1007/978-3-642- 23808-6_10

work page doi:10.1007/978-3-642- 2011