A novel attention mechanism for noise-adaptive and robust segmentation of microtubules in microscopy images
Pith reviewed 2026-05-19 05:35 UTC · model grok-4.3
The pith
A noise-adaptive attention mechanism integrated into a residual U-Net segments microtubules accurately in noisy microscopy images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce ASE_Res_UNet, which places a noise-adaptive attention mechanism that extends the Squeeze-and-Excitation module into the decoder of a U-Net equipped with residual encoder blocks. This mechanism allows the network to adjust feature emphasis dynamically as noise levels change across images. A separate synthetic data generation pipeline supplies precise annotations for fine filaments despite noise and class imbalance. Systematic tests establish that ASE_Res_UNet outperforms its ablated variants, alternative attention modules, and other architectures on both synthetic and real microtubule datasets while remaining parameter-efficient and transferring to other curvilinear bio-
What carries the argument
The noise-adaptive attention mechanism, an extension of the Squeeze-and-Excitation module that dynamically scales channel responses according to estimated noise levels in the input image.
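The summary above does not give the paper's actual formulation, so the following is only a minimal sketch of one way an SE-style gate could be conditioned on a noise estimate. The noise proxy (`estimate_noise`), the choice to append the noise scalar to the squeezed descriptor, the reduction ratio, and the random weights are all assumptions made for illustration, not the authors' design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def estimate_noise(image):
    """Hypothetical noise proxy: mean absolute difference between horizontal neighbours."""
    return float(np.mean(np.abs(np.diff(image, axis=-1))))

def noise_conditioned_se(features, w1, w2, noise_level):
    """SE-style channel gating on a (C, H, W) feature map.

    The noise scalar is appended to the squeezed descriptor (an assumption),
    so the per-channel gates can change with the estimated noise level.
    """
    squeezed = features.mean(axis=(1, 2))          # squeeze: global average pool -> (C,)
    descriptor = np.append(squeezed, noise_level)  # (C + 1,)
    hidden = np.maximum(0.0, w1 @ descriptor)      # excitation bottleneck with ReLU
    gates = sigmoid(w2 @ hidden)                   # per-channel gates in (0, 1)
    return features * gates[:, None, None], gates

rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4                          # r: reduction ratio (illustrative)
image = rng.normal(size=(H, W))                    # stand-in for a noisy input image
features = rng.normal(size=(C, H, W))              # stand-in for decoder features
w1 = rng.normal(scale=0.1, size=(C // r, C + 1))   # random, untrained weights
w2 = rng.normal(scale=0.1, size=(C, C // r))

noise = estimate_noise(image)
out, gates = noise_conditioned_se(features, w1, w2, noise)
```

In a trained network `w1` and `w2` would be learned; here they are random, so the sketch only shows the data flow and tensor shapes.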
If this is right
- The model segments microtubules more accurately than ablated versions or competing attention mechanisms while using fewer parameters.
- It maintains competitive performance on newly curated real microscopy datasets without requiring large amounts of manual annotation.
- The same architecture transfers to segmentation of blood vessels and nerves across different imaging modalities.
- The approach reduces the impact of class imbalance and annotation difficulty for curvilinear structures in general.
Where Pith is reading between the lines
- If the synthetic data strategy proves robust across microscope types, labs could train high-performing filament models without collecting thousands of manually labeled real images.
- The lightweight design opens the possibility of running the segmentation live during time-lapse experiments on standard laboratory computers.
- The noise-adaptive idea could be tested on electron-microscopy images of other thin biological filaments where similar noise and density problems occur.
Load-bearing premise
The synthetic images generated for training match the noise statistics and filament geometry of real microscopy data closely enough that a model trained on them will generalize to real images without major performance loss.
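To make the premise concrete, here is a minimal sketch of how a synthetic image can come with an exact annotation: the mask is drawn first and the noise is applied afterwards. It is not the paper's generator. Straight segments stand in for curved filaments, and the Poisson-plus-Gaussian noise model and every parameter value are assumptions.

```python
import numpy as np

def synthetic_filament_image(size=64, n_filaments=3, signal=50.0,
                             background=5.0, read_noise=2.0, seed=0):
    """Return (noisy_image, mask); the mask is exact because it is drawn before noise."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((size, size), dtype=bool)
    for _ in range(n_filaments):
        # Straight segment between two random endpoints (stand-in for a filament).
        (r0, c0), (r1, c1) = rng.integers(0, size, size=(2, 2))
        n = 2 * size  # oversample so the rasterized line has no gaps
        rows = np.rint(np.linspace(r0, r1, n)).astype(int)
        cols = np.rint(np.linspace(c0, c1, n)).astype(int)
        mask[rows, cols] = True
    clean = background + signal * mask
    # Assumed noise model: Poisson shot noise plus Gaussian read noise.
    noisy = rng.poisson(clean).astype(float) + rng.normal(0.0, read_noise, clean.shape)
    return noisy, mask

image, mask = synthetic_filament_image()
foreground_fraction = float(mask.mean())  # small: illustrates the class imbalance
```

Whether images like these match real microscopy statistics is exactly the open question this premise names; the sketch only shows why annotation itself is free in the synthetic regime.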
What would settle it
A head-to-head test on a new set of real microscopy images in which ASE_Res_UNet shows clearly lower segmentation accuracy or higher error than a standard U-Net or an alternative attention model when both are trained only on the synthetic data.
Original abstract
Segmenting cytoskeletal filaments in microscopy images is essential for studying their roles in cellular processes. However, this task is highly challenging due to the fine, densely packed, and intertwined nature of these structures. Imaging limitations further complicate analysis. While deep learning has advanced segmentation of large, well-defined biological structures, its performance often degrades under such adverse conditions. Additional challenges include obtaining precise annotations for curvilinear structures and managing severe class imbalance during training. We introduce a novel noise-adaptive attention mechanism that extends the Squeeze-and-Excitation (SE) module to dynamically adjust to varying noise levels. Integrated into a U-Net decoder with residual encoder blocks, this yields ASE_Res_UNet, a lightweight yet high-performance model. We also developed a synthetic dataset generation strategy that ensures accurate annotations of fine filaments in noisy images. We systematically evaluated loss functions and metrics to mitigate class imbalance, ensuring robust performance assessment. ASE_Res_UNet effectively segmented microtubules in noisy synthetic images, outperforming its ablated variants. It also demonstrated superior segmentation compared to models with alternative attention mechanisms or distinct architectures, while requiring fewer parameters, making it efficient for resource-constrained environments. Evaluation on a newly curated real microscopy dataset and a recently reannotated dataset highlighted ASE_Res_UNet's effectiveness in segmenting microtubules beyond synthetic images. For these datasets, ASE_Res_UNet was competitive with a recent synthetic data-driven approach that shares two cytoskeleton pretrained models. Importantly, ASE_Res_UNet showed strong transferability to other curvilinear structures (blood vessels and nerves) across diverse imaging conditions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ASE_Res_UNet, a residual U-Net augmented with a novel noise-adaptive attention mechanism that extends the Squeeze-and-Excitation module to handle varying noise levels in microscopy images. It introduces a synthetic dataset generator for precise filament annotations under noise and class imbalance, systematically compares loss functions and metrics, and reports superior segmentation performance on noisy synthetic data versus ablated and baseline models, competitive results on curated real microtubule datasets, and transfer to blood vessels and nerves, all while using fewer parameters than alternatives.
Significance. If the empirical claims hold, the work provides a lightweight, noise-adaptive architecture that addresses practical challenges in segmenting fine curvilinear structures under realistic imaging conditions. The parameter efficiency and demonstrated cross-structure transferability are notable strengths for resource-limited settings in quantitative cell biology. Systematic loss-function and metric evaluations add value for handling severe class imbalance in filament segmentation tasks.
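As a small illustration of why metric choice matters under severe class imbalance, the sketch below compares pixel accuracy with Dice and IoU for a prediction that misses every filament pixel. The 1% foreground fraction is an invented toy value, not a figure from the paper.

```python
import numpy as np

def dice_and_iou(pred, target, eps=1e-8):
    """Dice and IoU for binary masks; eps avoids division by zero on empty masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)

# Toy ground truth: a single one-pixel-wide line, i.e. 1% foreground.
target = np.zeros((100, 100), dtype=bool)
target[50, :] = True

all_background = np.zeros_like(target)                 # predicts no filament at all
accuracy = float((all_background == target).mean())    # high despite missing everything
dice, iou = dice_and_iou(all_background, target)       # both close to zero
perfect_dice, perfect_iou = dice_and_iou(target, target)
```

Accuracy rewards the trivial all-background answer, while overlap metrics do not, which is why Dice and IoU are the usual choices for thin structures.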
major comments (2)
- [Real data evaluation] Real microscopy dataset evaluation (abstract and corresponding results): No quantitative distributional comparison (e.g., MMD, Wasserstein distance on intensity histograms, curvature, or noise power spectra) is provided between the synthetic generator outputs and the real test images. This assumption is load-bearing for the generalization claim, since all architecture and loss ablations were performed only in the synthetic regime; without it, competitive real-data performance could reflect test-set characteristics rather than the noise-adaptive features.
- [Performance comparisons] Performance comparison results: The abstract and results report outperformance over ablated variants and alternative attention mechanisms without error bars, exact training/test split sizes, or statistical significance tests (e.g., paired t-tests or Wilcoxon tests on Dice/IoU). This weakens the robustness of the central empirical claims, particularly given the moderate soundness noted in the absence of these details.
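The distributional comparison requested in the first major comment could be sketched as follows for pixel intensities. This is a minimal one-dimensional Wasserstein-1 distance that assumes equal sample sizes; the Gaussian samples are placeholders, not data from the paper.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal sizes, the optimal transport plan matches sorted values,
    so the distance is the mean absolute difference of the order statistics.
    """
    a = np.sort(np.ravel(a).astype(float))
    b = np.sort(np.ravel(b).astype(float))
    if a.size != b.size:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))

rng = np.random.default_rng(0)
synthetic = rng.normal(10.0, 2.0, size=4096)   # placeholder "synthetic" intensities
real_like = rng.normal(10.0, 2.0, size=4096)   # placeholder "real" intensities
shifted = real_like + 3.0                      # same shape, brighter background

d_same = wasserstein_1d(synthetic, real_like)  # small: same underlying distribution
d_shift = wasserstein_1d(synthetic, shifted)   # larger: reflects the intensity shift
```

The same function could be applied to other scalar descriptors the referee lists, such as per-pixel curvature values, provided both samples are drawn at the same size.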
minor comments (2)
- [Methods] The mathematical formulation of the noise-adaptive attention extension could be presented with an explicit equation or pseudocode in the methods to improve reproducibility of the dynamic adjustment to noise levels.
- [Figures] Figure captions for qualitative segmentation results should explicitly state the noise level or SNR range for each example to allow direct visual assessment of the claimed noise-adaptivity.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which help improve the clarity and robustness of our empirical evaluations. We address each major comment below and will incorporate the suggested revisions in the updated manuscript.
-
Referee: [Real data evaluation] Real microscopy dataset evaluation (abstract and corresponding results): No quantitative distributional comparison (e.g., MMD, Wasserstein distance on intensity histograms, curvature, or noise power spectra) is provided between the synthetic generator outputs and the real test images. This assumption is load-bearing for the generalization claim, since all architecture and loss ablations were performed only in the synthetic regime; without it, competitive real-data performance could reflect test-set characteristics rather than the noise-adaptive features.
Authors: We recognize that quantitative measures of distributional similarity between the synthetic and real datasets would strengthen the claim that the noise-adaptive attention mechanism generalizes. The manuscript includes visual examples and reports competitive performance on real microtubule datasets, but we did not compute metrics such as MMD or Wasserstein distance. In the revised paper we will add such analyses, for instance comparisons of intensity distributions and noise characteristics, to support the transferability claim more rigorously. revision: yes
-
Referee: [Performance comparisons] Performance comparison results: The abstract and results report outperformance over ablated variants and alternative attention mechanisms without error bars, exact training/test split sizes, or statistical significance tests (e.g., paired t-tests or Wilcoxon tests on Dice/IoU). This weakens the robustness of the central empirical claims, particularly given the moderate soundness noted in the absence of these details.
Authors: We agree that the absence of error bars, precise split information, and statistical tests limits the strength of the performance claims. The manuscript reports average metrics from our experiments, but to enhance robustness, we will revise the results section to include standard deviations across repeated experiments, detail the training and test split sizes, and perform statistical significance testing (e.g., Wilcoxon signed-rank test) on key metrics like Dice and IoU, including the results in the updated version. revision: yes
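The paired significance test mentioned in the response could look like the sketch below: an exact two-sided Wilcoxon signed-rank test for a small number of paired scores, computed by enumerating all sign assignments. It assumes no zero differences and no tied absolute differences. The Dice values are made-up placeholders, not results from the paper.

```python
from itertools import product

def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Returns (W+, p-value). Assumes no zero differences and no tied |differences|.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    if any(d == 0 for d in diffs) or len({abs(d) for d in diffs}) != n:
        raise ValueError("this sketch does not handle zeros or ties")
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    observed = abs(w_plus - mean_w)
    # Under the null hypothesis each rank carries a + or - sign with probability 1/2.
    extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(range(1, n + 1), signs) if s) - mean_w) >= observed
    )
    return w_plus, extreme / 2 ** n

# Made-up per-image Dice scores for two models on the same eight test images.
model_a = [0.81, 0.79, 0.84, 0.77, 0.83, 0.80, 0.82, 0.78]
model_b = [0.80, 0.77, 0.81, 0.73, 0.78, 0.74, 0.75, 0.70]
w_plus, p_value = wilcoxon_signed_rank_exact(model_a, model_b)
```

Enumeration costs 2^n evaluations, so this only suits small test sets; for larger samples a library implementation with a normal approximation and tie handling would be the practical choice.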
Circularity Check
No circularity in empirical model evaluation and dataset strategy
The paper presents an applied ML architecture (ASE_Res_UNet) and a synthetic data generator, with all central claims resting on direct empirical comparisons of segmentation metrics against ablated variants, alternative attention modules, and other architectures on both synthetic and real test sets. These measurements are independent of any fitted parameter being renamed as a prediction, and no derivation chain reduces by the paper's own equations or self-citations to a tautological input. The synthetic generation strategy is described as a practical means to obtain accurate annotations rather than a self-referential loop that forces the reported transfer results.
Axiom & Free-Parameter Ledger
free parameters (1)
- Network hyperparameters and loss weighting coefficients
axioms (1)
- domain assumption: Synthetic images generated by the described strategy have noise and geometry statistics close enough to real data for the trained model to generalize.
Forward citations
Cited by 1 Pith paper
-
MTCurv: Deep learning for direct microtubule curvature mapping in noisy fluorescence microscopy images
MTCurv regresses pixel-wise microtubule curvature maps from noisy images using an attention-based residual U-Net trained on synthetic data with a gradient-aware loss.