Uncertainty-Guided Attention and Entropy-Weighted Loss for Precise Plant Seedling Segmentation

Ali Hamdi; Mohamed Ehab

arxiv: 2604.10823 · v1 · submitted 2026-04-12 · 💻 cs.CV · cs.LG

Pith reviewed 2026-05-10 15:18 UTC · model grok-4.3

keywords plant seedling segmentation · uncertainty-guided attention · entropy-weighted loss · deep supervision · boundary precision · precision agriculture · image segmentation

The pith

Uncertainty-guided attention and entropy-weighted loss sharpen segmentation of fine plant seedling structures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that feeding uncertainty estimates into both attention layers and the loss function allows standard segmentation networks to handle intricate leaf boundaries and cluttered backgrounds more reliably. This would matter for automated plant phenotyping because it promises higher accuracy in measuring growth traits without manual intervention. The authors assemble UGDA-Net from three additions to U-Net and LinkNet: dual attention modulated by channel variance, a hybrid loss that up-weights high-entropy boundary pixels, and deep supervision on encoder layers. Systematic ablations on 432 high-resolution seedling images report gains in overlap metrics and visibly cleaner boundary predictions, with uncertainty maps matching the observed morphological complexity.

Core claim

The authors claim that uncertainty-guided dual attention, which modulates feature maps via channel variance, combined with an entropy-weighted hybrid loss that emphasizes high-uncertainty boundary pixels and deep supervision on intermediate encoder layers, produces more precise segmentation of plant seedlings than the unmodified base architectures.

What carries the argument

Uncertainty-Guided Dual Attention (UGDA), which uses channel variance to modulate feature maps and direct focus toward uncertain regions.
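
The page does not reproduce the paper's formula for UGDA, so the following is only a minimal sketch of one plausible reading of "channel variance modulates feature maps". It assumes a channel gate built from each channel's spatial variance and a spatial gate built from the variance across channels at each pixel. The function name `variance_gated`, the sigmoid squashing, and the residual form `x * (1 + gate)` are all illustrative assumptions, not the authors' design.

```python
import math
from statistics import pvariance


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def variance_gated(features):
    """Modulate a feature tensor by variance-derived gates (hypothetical sketch).

    features: list of C channels, each an H x W nested list of floats.
    """
    C, H, W = len(features), len(features[0]), len(features[0][0])

    # Channel gate (assumption): spatial variance of each channel, squashed by a sigmoid.
    channel_gate = [
        sigmoid(pvariance([v for row in ch for v in row])) for ch in features
    ]

    # Spatial gate (assumption): variance across channels at each pixel, squashed likewise.
    spatial_gate = [
        [sigmoid(pvariance([features[c][i][j] for c in range(C)])) for j in range(W)]
        for i in range(H)
    ]

    # Residual modulation: the input is kept and scaled up more where both gates are large.
    return [
        [
            [features[c][i][j] * (1.0 + channel_gate[c] * spatial_gate[i][j])
             for j in range(W)]
            for i in range(H)
        ]
        for c in range(C)
    ]
```

Because variance is non-negative, both gates lie in [0.5, 1) under this choice, so the sketch only ever amplifies responses; a real implementation would more likely learn the gating through small convolutional or fully connected layers.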

If this is right

Leaf-boundary false positives decrease when attention and loss both respond to pixel-wise uncertainty.
Uncertainty heatmaps produced by the model align with the fine morphological details of seedlings.
The same components improve both U-Net and LinkNet baselines without architecture-specific redesign.
Deep supervision on encoder layers complements the uncertainty signals to stabilize training for delicate structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same uncertainty modulation could be tested on other thin-structure tasks such as root or vein segmentation where boundary errors dominate.
Entropy-weighted losses might serve as a drop-in replacement for focal loss in any setting where uncertain pixels coincide with class boundaries.
If uncertainty maps prove reliable, they could guide active selection of new training images that contain the hardest leaf edges.

Load-bearing premise

The reported segmentation gains arise from the three added uncertainty components rather than from any unstated differences in training schedules, augmentations, or hyperparameters across the ablation runs.

What would settle it

A re-training of every ablation configuration on the same data splits using identical augmentation pipelines and optimization settings that yields no Dice improvement would falsify the claim that the uncertainty mechanisms drive the gains.
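
A minimal sketch of the bookkeeping such a re-training would need, assuming each configuration is run once per shared random seed and scored with the Dice coefficient on the same test masks. The helper names `dice` and `paired_gain` are hypothetical, and no values from the paper are used.

```python
from statistics import mean, stdev


def dice(pred, target):
    """Dice coefficient for two flat binary masks given as lists of 0/1."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if denom == 0 else 2.0 * inter / denom


def paired_gain(baseline_scores, variant_scores):
    """Mean and sample standard deviation of per-seed Dice differences.

    Both lists must be ordered by the same seeds so that each difference
    compares runs that share everything except the component under test.
    """
    diffs = [v - b for b, v in zip(baseline_scores, variant_scores)]
    return mean(diffs), stdev(diffs)
```

A mean difference that is small relative to its standard deviation across seeds would point toward the "no Dice improvement" outcome described above.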

Figures

Figures reproduced from arXiv: 2604.10823 by Ali Hamdi, Mohamed Ehab.

**Figure 2.** UGDA-Net architecture. The blue boxes show feature maps from the encoder/decoder with the number of channels.

**Figure 3.** Results of qualitative segmentation on a sample of

Original abstract

Plant seedling segmentation supports automated phenotyping in precision agriculture. Standard segmentation models face difficulties due to intricate background images and fine structures in leaves. We introduce UGDA-Net (Uncertainty-Guided Dual Attention Network with Entropy-Weighted Loss and Deep Supervision). Three novel components make up UGDA-Net. The first component is Uncertainty-Guided Dual Attention (UGDA). UGDA uses channel variance to modulate feature maps. The second component is an entropy-weighted hybrid loss function. This loss function focuses on high-uncertainty boundary pixels. The third component employs deep supervision for intermediate encoder layers. We performed a comprehensive systematic ablation study. This study focuses on two widely-used architectures, U-Net and LinkNet. It analyzes five incremental configurations: Baseline, Loss-only, Attention-only, Deep Supervision, and UGDA-Net. We trained UGDA-net using a high-resolution plant seedling image dataset containing 432 images. We demonstrate improved segmentation performance and accuracy. With an increase in Dice coefficient of 9.3% above baseline. LinkNet's variance is 13.2% above baseline. Overlays that are qualitative in nature show the reduced false positives at the leaf boundary. Uncertainty heatmaps are consistent with the complex morphology. UGDA-Net aids in the segmentation of delicate structures in plants and provides a high-def solution. The results showed that uncertainty-guided attention and uncertainty-weighted loss are two complementing systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper adds channel-variance attention and entropy-weighted boundary loss to U-Net and LinkNet for seedling segmentation, with a reported 9.3% Dice gain, but the ablations do not confirm that training details stayed fixed across variants.

The main point is that UGDA-Net layers uncertainty-guided dual attention (via channel variance) and an entropy-weighted hybrid loss onto standard segmentation backbones, plus deep supervision, and reports better masks on a 432-image plant seedling set. The ablation across baseline, loss-only, attention-only, deep supervision, and full model shows progressive Dice lifts and lower variance on LinkNet, with qualitative overlays indicating fewer boundary false positives and uncertainty maps that track leaf complexity. That combination is a reasonable incremental step for handling fine structures in cluttered agricultural images, and the authors do run the same two architectures through all five configurations on the shared data.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes UGDA-Net for segmenting plant seedlings in complex images, featuring three innovations: Uncertainty-Guided Dual Attention (UGDA) that modulates features using channel variance, an entropy-weighted hybrid loss to emphasize uncertain boundary pixels, and deep supervision applied to intermediate encoder layers. The authors conduct an ablation study on U-Net and LinkNet using five configurations (Baseline, Loss-only, Attention-only, Deep Supervision, UGDA-Net) trained on a dataset of 432 images, reporting a 9.3% Dice coefficient improvement over baseline and 13.2% variance improvement for LinkNet, along with qualitative evidence of better boundary handling.

Significance. If the performance gains can be reliably attributed to the proposed components, the use of self-derived uncertainty signals to guide both attention and loss weighting offers a practical way to handle fine leaf structures and cluttered backgrounds in precision agriculture imaging. The paper receives credit for performing a systematic ablation across two standard architectures and for reporting concrete numerical improvements (9.3% Dice, 13.2% variance) rather than qualitative assertions alone. These elements would strengthen the contribution if the experimental controls are clarified.

major comments (2)

[Ablation study] The ablation study description (abstract and experimental results) provides no evidence that learning rate, optimizer, data augmentation policy, epoch count, or random seeds were held fixed across the Baseline, Loss-only, Attention-only, Deep Supervision, and UGDA-Net configurations. Without this control, the reported 9.3% Dice gain and variance reduction cannot be confidently attributed to the three novel components rather than unstated differences in training protocol.
[Experimental evaluation] No information is given on train/validation/test splits for the 432-image dataset, use of cross-validation, error bars, or statistical significance tests for the Dice and variance metrics. These details are required to evaluate whether the claimed improvements are robust.

minor comments (2)

[Abstract] The abstract statement 'LinkNet's variance is 13.2% above baseline' is unclear and appears inconsistent with the surrounding claims of improved accuracy and reduced false positives; please define the variance metric and its direction of improvement.
[Abstract] Several abstract sentences are fragmented or stylistically awkward (e.g., 'Overlays that are qualitative in nature show the reduced false positives at the leaf boundary'). Minor rephrasing would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of experimental rigor. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our ablation study and evaluation protocol.

Referee: [Ablation study] The ablation study description (abstract and experimental results) provides no evidence that learning rate, optimizer, data augmentation policy, epoch count, or random seeds were held fixed across the Baseline, Loss-only, Attention-only, Deep Supervision, and UGDA-Net configurations. Without this control, the reported 9.3% Dice gain and variance reduction cannot be confidently attributed to the three novel components rather than unstated differences in training protocol.

Authors: We agree that identical training protocols across configurations are necessary to attribute gains to the proposed components. In our experiments, the learning rate, optimizer, data augmentation policy, epoch count, and random seeds were held fixed for all five configurations on both U-Net and LinkNet. We will explicitly document these controls in the revised experimental setup section.
revision: yes

Referee: [Experimental evaluation] No information is given on train/validation/test splits for the 432-image dataset, use of cross-validation, error bars, or statistical significance tests for the Dice and variance metrics. These details are required to evaluate whether the claimed improvements are robust.

Authors: We acknowledge that these details were omitted. The dataset was divided using a fixed train/validation/test split, cross-validation was not applied given the dataset size, and results were averaged over multiple runs with error bars. We will add the exact split ratios, note the absence of cross-validation, include error bars, and report a statistical significance test in the revised manuscript.
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical ablation claims do not reduce to self-definition or fitted inputs

The paper introduces UGDA-Net via three components (uncertainty-guided dual attention using channel variance, entropy-weighted loss, deep supervision) and reports Dice gains from incremental ablations on U-Net/LinkNet. No equations, derivations, or first-principles results are present that equate outputs to inputs by construction. The uncertainty signal is computed from the model's own feature maps or predictions and then applied to modulate attention or loss weights; this is a standard non-tautological design choice and does not make the final Dice coefficient equivalent to the input by definition. No self-citations appear as load-bearing premises, no uniqueness theorems are invoked, and no parameters are fitted on a subset then renamed as predictions. The ablation isolates component effects only insofar as training protocols are held constant (unstated details affect validity, not circularity). The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard supervised segmentation assumptions plus three ad-hoc design choices whose justification is empirical rather than derived.

free parameters (2)

entropy weight schedule
The hybrid loss uses entropy to weight boundary pixels; the scaling factor or schedule is not stated and must be chosen or fitted.
deep supervision weights
Relative loss weights on intermediate encoder layers are free parameters that affect the reported Dice gain.
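
To make these two free parameters concrete, here is a minimal sketch of an entropy-weighted hybrid loss with deep supervision. It assumes binary segmentation, a per-pixel weight of `1 + lam * H(p) / ln 2`, and an equal-weight mix of weighted cross-entropy and soft Dice. The names `lam`, `alpha`, and `aux_weights` are hypothetical stand-ins for the unstated entropy scale and supervision weights, not values from the paper.

```python
import math

EPS = 1e-7


def binary_entropy(p):
    """Shannon entropy (in nats) of a Bernoulli prediction p."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def entropy_weighted_bce(probs, targets, lam=1.0):
    """Weighted mean BCE; each pixel's weight lies in [1, 1 + lam]."""
    total, weight_sum = 0.0, 0.0
    for p, t in zip(probs, targets):
        p_c = min(max(p, EPS), 1.0 - EPS)
        bce = -(t * math.log(p_c) + (1 - t) * math.log(1.0 - p_c))
        # In a training framework this weight would be treated as a constant
        # (no gradient), so the model cannot lower the loss by changing it.
        w = 1.0 + lam * binary_entropy(p) / math.log(2.0)
        total += w * bce
        weight_sum += w
    return total / weight_sum


def soft_dice_loss(probs, targets):
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + EPS) / (sum(probs) + sum(targets) + EPS)


def hybrid_loss(probs, targets, lam=1.0, alpha=0.5):
    """alpha mixes the entropy-weighted BCE with the soft Dice loss."""
    bce = entropy_weighted_bce(probs, targets, lam)
    return alpha * bce + (1.0 - alpha) * soft_dice_loss(probs, targets)


def deeply_supervised_loss(main, auxiliaries, aux_weights, targets, lam=1.0):
    """Main-output loss plus weighted losses on auxiliary (intermediate) outputs."""
    loss = hybrid_loss(main, targets, lam)
    for probs, w in zip(auxiliaries, aux_weights):
        loss += w * hybrid_loss(probs, targets, lam)
    return loss
```

Setting `lam=0.0` recovers an unweighted cross-entropy term and an empty `auxiliaries` list removes deep supervision, which is one way to express the Loss-only and Deep Supervision ablation arms with a single function.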

axioms (2)

domain assumption: Channel variance is a reliable proxy for feature uncertainty
Invoked in the definition of UGDA without external validation or derivation.
domain assumption: High-entropy pixels coincide with segmentation boundaries that matter for the task
Used to justify the entropy-weighted loss.
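
The second assumption can be probed empirically. The sketch below assumes a binary ground-truth mask and a matching map of predicted foreground probabilities, and measures what fraction of the `k` highest-entropy pixels fall on the mask boundary. The helper names and the 4-neighbour boundary definition are illustrative choices, not taken from the paper.

```python
import math


def binary_entropy(p, eps=1e-7):
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def boundary_pixels(mask):
    """Coordinates of pixels whose 4-neighbourhood contains a different label."""
    H, W = len(mask), len(mask[0])
    edge = set()
    for i in range(H):
        for j in range(W):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < H and 0 <= nj < W and mask[ni][nj] != mask[i][j]:
                    edge.add((i, j))
                    break
    return edge


def boundary_hit_rate(probs, mask, k):
    """Fraction of the k highest-entropy pixels that lie on the mask boundary."""
    H, W = len(mask), len(mask[0])
    ranked = sorted(
        ((i, j) for i in range(H) for j in range(W)),
        key=lambda ij: binary_entropy(probs[ij[0]][ij[1]]),
        reverse=True,
    )
    edge = boundary_pixels(mask)
    return sum(ij in edge for ij in ranked[:k]) / k
```

A hit rate close to the share of boundary pixels in the whole image would suggest that entropy is not singling out boundaries, which would weaken the motivation for the entropy-weighted loss.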

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

[1] Shamim Banu and S. Deivalakshmi. Enhancing leaf area segmentation by using attention gates and knowledge distillation in unet architecture. Journal of Telecommunications and Information Technology, 101:51–62, 09 2025.

[2] Alexander Buslaev, Alex Parinov, Eugene Khvedchenya, Vladimir Iglovikov, and Alexandr Kalinin. Albumentations: fast and flexible image augmentations, 09 2018.

[3] Abhishek Chaurasia and Eugenio Culurciello. Linknet: Exploiting encoder representations for efficient semantic segmentation. In 2017 IEEE Visual Communications and Image Processing (VCIP), pages 1–4, 2017.

[4] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: representing model uncertainty in deep learning. In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML’16, pages 1050–1059. JMLR.org, 2016.

[5] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, 2010.

[6] Seyed Hosseini and Mahdieh Soleymani. Dilated balanced cross entropy loss for medical image segmentation. BMC Medical Imaging, 26, 02 2026.

[7] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7132–7141, 2018.

[8] Zhhengyong Huang and Yao Sui. Contour-weighted loss for class-imbalanced image segmentation, 2024.

[9] Yu Jiang and Changying Li. Convolutional neural networks for image-based high-throughput plant phenotyping: A review. Plant Phenomics, 2020:4152816, 2020.

[10] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.

[11] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2017.

[12] Zifei Luo, Wenzhu Yang, Yunfeng Yuan, Ruru Gou, and Xiaonan Li. Semantic segmentation of agricultural images: A survey. Information Processing in Agriculture, 11(2):172–186, 2024.

[13] Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, and Hao Wu. Mixed precision training, 2018.

[14] Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. 06 2016.

[15] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation, 2015.

[16] Hanieh Shojaei Miandashti, Qianqian Zou, and Max Mehltretter. Uncertainty estimation and out-of-distribution detection for lidar scene semantic segmentation. In Computer Vision – ECCV 2024 Workshops: Milan, Italy, September 29–October 4, 2024, Proceedings, Part VII, pages 116–131, Berlin, Heidelberg, 2024. Springer-Verlag.

[17] Lei Song, Bo Jiang, and Huaibo Song. Joint depth-segmentation learning with segment priors for non-contact seedling height and stem thickness estimation. Eng. Appl. Artif. Intell., 159(PA), November 2025.

[18] Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. Cbam: Convolutional block attention module. In Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part VII, pages 3–19, Berlin, Heidelberg, 2018. Springer-Verlag.

[19] Huilin Yin, Pengyu Wang, Boyu Liu, and Jun Yan. An uncertainty-aware domain adaptive semantic segmentation framework. Autonomous Intelligent Systems, 4, 07 2024.

[20] Xuanchen Zhou, Gengshen Wu, Xin Sun, Pengpeng Hu, and Yi Liu. Attention-based multi-kernelized and boundary-aware network for image semantic segmentation. Neurocomputing, 597:127988, 2024.
