
arxiv: 1907.05143 · v1 · pith:EKEMS3NVnew · submitted 2019-07-11 · 💻 cs.CV · cs.LG

Deep Active Learning for Axon-Myelin Segmentation on Histology Data

Pith reviewed 2026-05-24 23:13 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords deep active learning · myelin segmentation · histology · U-Net · uncertainty estimation · electron microscopy · image segmentation · annotation reduction

The pith

A U-Net reaches peak myelin segmentation accuracy after annotating just three uncertainty-selected histology images instead of fifteen random ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates a deep active learning approach for segmenting myelin in histology images from electron microscopy. By using uncertainty estimates from multiple forward passes with dropout in a U-Net, the method identifies which unlabeled images would most improve the model when annotated. On two small datasets of spinal cord and brain samples, this led to maximum performance with far fewer annotations than selecting images at random. This matters because creating ground truth labels for such images requires expert time at the pixel level, making large training sets impractical. The authors provide code to apply the framework to new datasets.

Core claim

Experiments on spinal cord and brain microscopic histology samples showed that the method reached a maximum Dice value after adding 3 uncertainty-selected samples to the initial training set, versus 15 randomly-selected samples, thereby significantly reducing the annotation effort.
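
For readers unfamiliar with the metric, the Dice value in this claim measures the overlap between a predicted mask and the ground truth: twice the size of the intersection divided by the sum of the two mask sizes. The sketch below is a minimal plain-Python illustration. The function name `dice_coefficient`, the flat 0/1 list representation, and the convention of returning 1.0 for two empty masks are assumptions made here, not details taken from the paper or its repository.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 image flattened row by row; 1 marks a myelin pixel.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide and 0.0 means they share no foreground pixel, so "maximum Dice" in the claim refers to the highest overlap reached on the test set.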

What carries the argument

An overall uncertainty measure, obtained by drawing Monte Carlo samples from the U-Net with Dropout kept active, which is used to select which samples to annotate.
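
The idea can be sketched in a few lines: run the same image through the network several times with Dropout left on, measure how much the per-pixel predictions disagree, and collapse that disagreement into one score per image. The code below is a toy illustration only. The four-weight "network" (`TOY_WEIGHTS`), the use of per-pixel variance, and the mean as the aggregation into an overall score are all assumptions made here; the source does not state the paper's exact aggregation formula or number of passes.

```python
import math
import random
from statistics import mean, pvariance
from typing import Dict, List, Sequence

# Hypothetical stand-in for a trained network: four hidden-unit weights.
TOY_WEIGHTS = [0.8, 1.2, 0.5, 1.5]


def stochastic_forward(pixels: Sequence[float], drop_prob: float,
                       rng: random.Random) -> List[float]:
    """One forward pass with Dropout left ON; returns per-pixel probabilities."""
    keep = 1.0 - drop_prob  # drop_prob must be strictly below 1
    probs = []
    for x in pixels:
        hidden = [max(0.0, w * x) for w in TOY_WEIGHTS]  # ReLU units
        # Inverted dropout: drop each unit independently, rescale survivors.
        kept = [h / keep if rng.random() < keep else 0.0 for h in hidden]
        logit = sum(kept) - 2.0
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


def mc_dropout_score(pixels: Sequence[float], n_passes: int = 20,
                     drop_prob: float = 0.5, seed: int = 0) -> float:
    """Mean per-pixel variance over n_passes stochastic forward passes."""
    rng = random.Random(seed)
    passes = [stochastic_forward(pixels, drop_prob, rng) for _ in range(n_passes)]
    per_pixel_variance = [pvariance(column) for column in zip(*passes)]
    return mean(per_pixel_variance)  # assumed aggregation: plain average


def most_uncertain(pool: Dict[str, Sequence[float]]) -> str:
    """Name of the unlabelled image with the highest overall score."""
    return max(pool, key=lambda name: mc_dropout_score(pool[name]))


# Two made-up "images": one blank, one with varying intensities.
pool = {"flat": [0.0, 0.0, 0.0, 0.0], "textured": [0.2, 1.0, 0.7, 1.4]}
choice = most_uncertain(pool)
```

In this toy setup the blank image produces identical predictions on every pass, so its score is zero and the textured image is proposed for annotation. With a real U-Net the same pattern would apply, with the forward pass replaced by the network in a mode that keeps Dropout active at inference time.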

If this is right

  • The framework achieves high segmentation performance with very few labelled samples on realistic small datasets.
  • It works across different acquisition settings, including Serial Block-Face Electron Microscopy and Transmission Electron Microscopy.
  • Annotation effort for experts is significantly reduced for axon-myelin segmentation tasks.
  • The straightforward implementation supports fast and accurate segmentation on new biomedical datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar uncertainty sampling could lower labeling costs for other pixel-level biomedical segmentation problems beyond myelin.
  • The approach may enable labs with limited annotation resources to apply deep models to their own histology data more readily.
  • Further checks on whether uncertainty scores predict actual error reduction across additional modalities would test broader applicability.
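
One way to run the check in the last point is a rank correlation between each image's uncertainty score and its segmentation error (for example, 1 minus Dice). The sketch below computes Spearman's rho as the Pearson correlation of average ranks. The function names and the five pairs of numbers are made up for illustration and are not results from the paper.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0.0 or sy == 0.0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up per-image values for illustration only.
uncertainty = [0.02, 0.10, 0.05, 0.30, 0.08]
error = [0.05, 0.20, 0.10, 0.40, 0.12]  # e.g. 1 - Dice per image
rho = spearman_rho(uncertainty, error)
```

A rho near 1 would suggest that the uncertainty score orders images much as their actual error does; a rho near 0 would suggest the score carries little information about where the model fails.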

Load-bearing premise

The uncertainty measure from Monte Carlo dropout samples reliably identifies the samples that most improve the segmentation model on these histology datasets.

What would settle it

A test showing that randomly selected samples produce equivalent or greater Dice score gains than uncertainty-selected samples when added in equal numbers to the same initial training sets.
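
Such a test amounts to running the same annotation loop twice from the same initial set, changing only the selection rule. The sketch below shows that loop in skeleton form. Every name in it is hypothetical: `run_simulation`, the `select` and `train_and_score` callables, and the `TOY_VALUE` table exist only to exercise the loop. The toy scorer is constructed so that the guided rule wins, so its output says nothing about the paper's result.

```python
import random
from typing import Callable, Dict, List, Sequence

Selector = Callable[[List[str], random.Random], str]


def run_simulation(initial: Sequence[str], pool: Sequence[str], select: Selector,
                   train_and_score: Callable[[List[str]], float],
                   budget: int, seed: int = 0) -> List[float]:
    """Add up to `budget` images one at a time; record the score after each step."""
    rng = random.Random(seed)
    labelled, remaining = list(initial), list(pool)
    curve = [train_and_score(labelled)]
    for _ in range(min(budget, len(remaining))):
        chosen = select(remaining, rng)
        remaining.remove(chosen)
        labelled.append(chosen)
        curve.append(train_and_score(labelled))  # stands in for retrain + test Dice
    return curve


# Hypothetical per-image "usefulness", used only to make the example runnable.
TOY_VALUE: Dict[str, float] = {"a": 0.05, "b": 0.20, "c": 0.01, "d": 0.10}


def toy_score(labelled: List[str]) -> float:
    return min(1.0, 0.5 + sum(TOY_VALUE.get(name, 0.0) for name in labelled))


def pick_highest_value(remaining: List[str], rng: random.Random) -> str:
    return max(remaining, key=lambda name: TOY_VALUE[name])


def pick_random(remaining: List[str], rng: random.Random) -> str:
    return rng.choice(remaining)


guided = run_simulation(["seed"], list(TOY_VALUE), pick_highest_value, toy_score, budget=2)
baseline = run_simulation(["seed"], list(TOY_VALUE), pick_random, toy_score, budget=2)
```

In a real experiment `select` would rank the unlabelled pool by a model-derived uncertainty score and `train_and_score` would retrain the network and return the test-set Dice. The claim would be undermined if the `baseline` curve matched or exceeded the `guided` curve at equal numbers of added images.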

Figures

Figures reproduced from arXiv: 1907.05143 by Christian S. Perone, Julien Cohen-Adad, Mathieu Boudreau, Melanie Lubrano di Scandalea.

Figure 1: Data Variability for axon-myelin histologies acquired with either …
Figure 4: Samples and Ground Truth of SBEM and TEM datasets
Figure 5: Active Learning simulation: Dice coefficient on SBEM test set over 15 …
Figure 6: Active Learning simulation: Dice coefficient on TEM test set over 15 …
Figure 7: Uncertainty maps evolution over active learning iterations for the Dice loss function.
Figure 8: Uncertainty maps evolution over active learning iterations for the Weighted binary cross-entropy loss function.
Figure 9: U-Net architecture used for these experiments.

Original abstract

Semantic segmentation is a crucial task in biomedical image processing, which recent breakthroughs in deep learning have helped to improve. However, deep learning methods in general are not yet widely used in practice, since they require large amounts of data to train complex models. This is particularly challenging for biomedical images, because data and ground truths are a scarce resource. Annotation efforts for biomedical images come with a real cost, since experts have to manually label images at pixel level on samples usually containing many instances of the target anatomy (e.g. in histology samples: neurons, astrocytes, mitochondria, etc.). In this paper we provide a framework for Deep Active Learning applied to a real-world scenario. Our framework relies on the U-Net architecture and an overall uncertainty measure to suggest which sample to annotate. It takes advantage of the uncertainty measure obtained by taking Monte Carlo samples while using the Dropout regularization scheme. Experiments were done on spinal cord and brain microscopic histology samples to perform a myelin segmentation task. Two realistic small datasets of 14 and 24 images were used, from different acquisition settings (Serial Block-Face Electron Microscopy and Transmission Electron Microscopy). The experiments showed that our method reached a maximum Dice value after adding 3 uncertainty-selected samples to the initial training set, versus 15 randomly-selected samples, thereby significantly reducing the annotation effort. We focused on a plausible scenario and showed evidence that this straightforward implementation achieves a high segmentation performance with very few labelled samples. We believe our framework may benefit any biomedical researcher willing to obtain fast and accurate image segmentation on their own dataset. The code is freely available at https://github.com/neuropoly/deep-active-learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a deep active learning framework that combines a U-Net architecture with Monte Carlo Dropout-based uncertainty sampling to select histology images for annotation in an axon-myelin segmentation task. On two small datasets (14 and 24 images of spinal cord and brain tissue, acquired under different EM modalities), the method is reported to reach its maximum Dice score after the addition of only 3 uncertainty-selected samples to an initial training set, versus 15 randomly selected samples.

Significance. If the performance gap is shown to be robust, the work would offer a practical, low-cost route to high-accuracy myelin segmentation when expert pixel-level labels are scarce. The public release of the code is a clear strength that aids reproducibility and adoption in biomedical imaging.

major comments (3)
  1. [Results / Experiments] Results section (and abstract): the central claim that uncertainty sampling reaches peak Dice after 3 samples versus 15 for random selection is presented without any report of multiple random initializations, standard deviations, or statistical tests. With total dataset sizes of only 14 and 24 images, performance curves are known to be sensitive to the composition of the initial labeled pool; absence of variance measures leaves the reported reduction in annotation effort unverified.
  2. [Methods / Experiments] Experimental protocol: the manuscript gives no description of how the initial training set is chosen, the size of the unlabeled pool at each iteration, the precise stopping criterion for the active-learning loop, or the full hyper-parameter settings used for the U-Net and MC-Dropout sampling. These omissions make it impossible to reproduce or assess the reliability of the 3-versus-15 comparison.
  3. [Results] Evaluation: only a single overall Dice value trajectory is shown; no per-class (axon vs. myelin) metrics, no comparison against other established active-learning acquisition functions (e.g., BALD, core-set), and no baseline using a non-Dropout uncertainty estimator are provided. This limits the ability to attribute the observed gain specifically to the MC-Dropout uncertainty measure.
minor comments (2)
  1. [Abstract] The abstract claims that the method is 'significantly reducing the annotation effort' without supplying the actual Dice curves, the number of MC samples, or any quantitative measure of significance.
  2. [Figures] Figure captions and axis labels should explicitly state the number of MC forward passes and the exact uncertainty aggregation formula used.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects for strengthening the experimental validation and reproducibility of our active learning framework. We address each major comment point-by-point below.

  1. Referee: [Results / Experiments] Results section (and abstract): the central claim that uncertainty sampling reaches peak Dice after 3 samples versus 15 for random selection is presented without any report of multiple random initializations, standard deviations, or statistical tests. With total dataset sizes of only 14 and 24 images, performance curves are known to be sensitive to the composition of the initial labeled pool; absence of variance measures leaves the reported reduction in annotation effort unverified.

    Authors: We agree that variance reporting and statistical analysis are essential for small datasets. In the revised manuscript we will rerun all experiments across multiple random initializations (minimum 5 seeds), report mean Dice trajectories with standard deviations, and include paired statistical tests (e.g., Wilcoxon signed-rank) between uncertainty and random selection curves. revision: yes

  2. Referee: [Methods / Experiments] Experimental protocol: the manuscript gives no description of how the initial training set is chosen, the size of the unlabeled pool at each iteration, the precise stopping criterion for the active-learning loop, or the full hyper-parameter settings used for the U-Net and MC-Dropout sampling. These omissions make it impossible to reproduce or assess the reliability of the 3-versus-15 comparison.

    Authors: We acknowledge the protocol details were insufficiently specified. The revised Methods section will explicitly state: initial training set selection procedure and size, unlabeled pool composition at each step, stopping criterion (performance plateau or fixed budget), and complete hyper-parameter values for the U-Net (architecture, optimizer, learning rate, epochs) together with MC-Dropout settings (dropout probability, number of forward passes). revision: yes

  3. Referee: [Results] Evaluation: only a single overall Dice value trajectory is shown; no per-class (axon vs. myelin) metrics, no comparison against other established active-learning acquisition functions (e.g., BALD, core-set), and no baseline using a non-Dropout uncertainty estimator are provided. This limits the ability to attribute the observed gain specifically to the MC-Dropout uncertainty measure.

    Authors: We will add per-class (axon and myelin) Dice scores to the results. However, systematic comparisons against BALD, core-set, and non-Dropout estimators would require extensive new experiments that exceed the scope of the current work, which demonstrates a simple, reproducible MC-Dropout baseline. We will note this limitation explicitly and indicate that such comparisons are left for future investigation. revision: partial
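
The analysis promised in the first response (mean and standard deviation across seeds, plus a paired Wilcoxon signed-rank test) can be sketched with the standard library alone. The function names and the Dice values below are made up for illustration. The exact p-value is obtained by enumerating all sign patterns, which is an implementation choice made here and is only practical for a small number of pairs.

```python
from itertools import product
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def summarize(curves: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Per-step (mean, sample std) across seeds; curves[s][t] is seed s, step t."""
    return [(mean(step), stdev(step)) for step in zip(*curves)]


def wilcoxon_signed_rank(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Exact two-sided paired test; returns (W, p). Zero differences are dropped."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b) if x != y]
    if not diffs:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences; ties share the average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2.0 + 1.0
        start = end + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, total - w_plus)
    # Under the null every sign pattern is equally likely; count those as extreme.
    hits = 0
    for signs in product((False, True), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= w:
            hits += 1
    return w, hits / 2 ** len(ranks)


# Made-up final Dice values for five seeds; not results from the paper.
uncertainty_runs = [0.86, 0.88, 0.85, 0.87, 0.89]
random_runs = [0.80, 0.84, 0.79, 0.85, 0.82]
w_stat, p_value = wilcoxon_signed_rank(uncertainty_runs, random_runs)
```

Even when uncertainty sampling wins on every one of five pairs, as in this toy data, the smallest attainable exact two-sided p-value is 2/32 = 0.0625. If the pairs in the planned test are the seeds, five of them cannot reach the conventional 0.05 threshold, so more seeds or pairing across active-learning steps would be needed.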

Circularity Check

0 steps flagged

No circularity: purely empirical active-learning experiments


The paper reports experimental results comparing uncertainty sampling (MC Dropout on U-Net) versus random selection on two small histology datasets (14 and 24 images). No derivation chain, first-principles equations, fitted parameters renamed as predictions, or self-citation load-bearing steps exist. The headline claim (max Dice after +3 uncertainty samples vs +15 random) is a direct empirical outcome, not reduced to inputs by construction. This matches the default non-finding for experimental papers without mathematical derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the work applies standard U-Net architecture and Monte Carlo Dropout uncertainty estimation from prior literature to a new domain.

pith-pipeline@v0.9.0 · 5835 in / 1035 out tokens · 34820 ms · 2026-05-24T23:13:00.896348+00:00 · methodology

