Deep Active Learning for Axon-Myelin Segmentation on Histology Data

Christian S. Perone; Julien Cohen-Adad; Mathieu Boudreau; Melanie Lubrano di Scandalea

arXiv: 1907.05143 · v1 · pith:EKEMS3NVnew · submitted 2019-07-11 · cs.CV · cs.LG

Melanie Lubrano di Scandalea, Christian S. Perone, Mathieu Boudreau, Julien Cohen-Adad

Pith reviewed 2026-05-24 23:13 UTC · model grok-4.3

classification 💻 cs.CV cs.LG

keywords: deep active learning · myelin segmentation · histology · U-Net · uncertainty estimation · electron microscopy · image segmentation · annotation reduction

The pith

A U-Net reaches peak myelin segmentation accuracy after annotating just three uncertainty-selected histology images instead of fifteen random ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates a deep active learning approach for segmenting myelin in histology images from electron microscopy. By using uncertainty estimates from multiple forward passes with dropout in a U-Net, the method identifies which unlabeled images would most improve the model when annotated. On two small datasets of spinal cord and brain samples, this led to maximum performance with far fewer annotations than selecting images at random. This matters because creating ground truth labels for such images requires expert time at the pixel level, making large training sets impractical. The authors provide code to apply the framework to new datasets.

Core claim

Experiments on spinal cord and brain microscopic histology samples showed that the method reached a maximum Dice value after adding 3 uncertainty-selected samples to the initial training set, versus 15 randomly-selected samples, thereby significantly reducing the annotation effort.
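The Dice value in this claim is the standard overlap score between a predicted mask and the ground truth. The sketch below is a minimal illustration for flat binary masks, not the authors' implementation; the function name `dice_coefficient` and the empty-mask convention are assumptions made here.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the two masks coincide and 0.0 means they share no foreground pixel.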

What carries the argument

An overall uncertainty measure, obtained by drawing Monte Carlo samples from the U-Net with Dropout kept active, which is then used to select the samples to annotate.
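One way to picture this measure is to run the same image through a dropout-enabled predictor several times and score the image by how much the passes disagree. The sketch below assumes per-pixel variance summed over the image as the aggregation; the formula actually used in the paper is not given on this page. The names `overall_uncertainty` and `make_toy_predictor` are hypothetical, and the toy predictor merely stands in for a U-Net.

```python
import random
from statistics import pvariance


def overall_uncertainty(stochastic_predict, image, n_passes=20):
    """Score one image by the spread of repeated dropout-enabled predictions."""
    # Each pass returns a flat list of per-pixel foreground probabilities.
    passes = [stochastic_predict(image) for _ in range(n_passes)]
    # Per-pixel variance across passes: large where the passes disagree.
    pixel_variance = [pvariance(values) for values in zip(*passes)]
    # Assumed aggregation: sum over pixels to get one score per image.
    return sum(pixel_variance)


def make_toy_predictor(drop_prob, seed):
    """Hypothetical stand-in for a network whose dropout stays on at test time."""
    rng = random.Random(seed)

    def predict(image):
        # Zero each pixel's output with probability drop_prob and rescale
        # the survivors (inverted dropout), clipping to a valid probability.
        return [
            0.0 if rng.random() < drop_prob else min(1.0, px / (1.0 - drop_prob))
            for px in image
        ]

    return predict
```

With `drop_prob=0.0` the toy predictor is deterministic and the score is zero; any disagreement between passes raises it.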

If this is right

The framework achieves high segmentation performance with very few labelled samples on realistic small datasets.
It works across different acquisition settings, including Serial Block-Face Electron Microscopy and Transmission Electron Microscopy.
Annotation effort for experts is significantly reduced for axon-myelin segmentation tasks.
The straightforward implementation supports fast and accurate segmentation on new biomedical datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar uncertainty sampling could lower labeling costs for other pixel-level biomedical segmentation problems beyond myelin.
The approach may enable labs with limited annotation resources to apply deep models to their own histology data more readily.
Further checks on whether uncertainty scores predict actual error reduction across additional modalities would test broader applicability.

Load-bearing premise

The uncertainty measure from Monte Carlo dropout samples reliably identifies the samples that most improve the segmentation model on these histology datasets.

What would settle it

A test showing that randomly selected samples produce equivalent or greater Dice score gains than uncertainty-selected samples when added in equal numbers to the same initial training sets.
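The selection step of such a test could be set up as follows: build one batch from the highest uncertainty scores and one of the same size drawn at random, then retrain on each and compare Dice. This is only a sketch of the batching; the function names are hypothetical and the retraining step is deliberately left out.

```python
import random


def select_most_uncertain(uncertainty_by_id, k):
    """Return the k pool image ids with the highest uncertainty score."""
    ranked = sorted(uncertainty_by_id, key=uncertainty_by_id.get, reverse=True)
    return ranked[:k]


def select_random(pool_ids, k, seed):
    """Return k pool image ids drawn uniformly without replacement."""
    return random.Random(seed).sample(sorted(pool_ids), k)


def paired_selections(uncertainty_by_id, k, seed):
    """Build the two equally sized batches that the comparison needs."""
    return {
        "uncertainty": select_most_uncertain(uncertainty_by_id, k),
        "random": select_random(uncertainty_by_id.keys(), k, seed),
    }
```

Keeping `k` identical for both batches is what makes the resulting Dice gains comparable.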

Figures

Figures reproduced from arXiv: 1907.05143 by Christian S. Perone, Julien Cohen-Adad, Mathieu Boudreau, Melanie Lubrano di Scandalea.

**Figure 4.** Samples and Ground Truth of SBEM and TEM datasets

**Figure 5.** Active Learning simulation: Dice coefficient on SBEM test set over 15

**Figure 6.** Active Learning simulation: Dice coefficient on TEM test set over 15

**Figure 7.** Uncertainty maps evolution over active learning iterations for the Dice loss function.

**Figure 8.** Uncertainty maps evolution over active learning iterations for the Weighted binary cross-entropy loss function.

**Figure 9.** U-Net architecture used for these experiments.

Original abstract

Semantic segmentation is a crucial task in biomedical image processing, which recent breakthroughs in deep learning have allowed to improve. However, deep learning methods in general are not yet widely used in practice since they require large amount of data for training complex models. This is particularly challenging for biomedical images, because data and ground truths are a scarce resource. Annotation efforts for biomedical images come with a real cost, since experts have to manually label images at pixel-level on samples usually containing many instances of the target anatomy (e.g. in histology samples: neurons, astrocytes, mitochondria, etc.). In this paper we provide a framework for Deep Active Learning applied to a real-world scenario. Our framework relies on the U-Net architecture and overall uncertainty measure to suggest which sample to annotate. It takes advantage of the uncertainty measure obtained by taking Monte Carlo samples while using Dropout regularization scheme. Experiments were done on spinal cord and brain microscopic histology samples to perform a myelin segmentation task. Two realistic small datasets of 14 and 24 images were used, from different acquisition settings (Serial Block-Face Electron Microscopy and Transmitting Electron Microscopy) and showed that our method reached a maximum Dice value after adding 3 uncertainty-selected samples to the initial training set, versus 15 randomly-selected samples, thereby significantly reducing the annotation effort. We focused on a plausible scenario and showed evidence that this straightforward implementation achieves a high segmentation performance with very few labelled samples. We believe our framework may benefit any biomedical researcher willing to obtain fast and accurate image segmentation on their own dataset. The code is freely available at https://github.com/neuropoly/deep-active-learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

Active learning cuts needed labels to 3 vs 15 on these tiny histology sets, but single-run results without variance leave the gain open to initial-set effects.

The paper applies Monte Carlo dropout uncertainty sampling on a U-Net to axon-myelin segmentation and reports reaching peak Dice after adding just three selected images rather than fifteen random ones, on two small EM histology collections of 14 and 24 images total from spinal cord and brain. That specific empirical demonstration on realistic low-label data is the concrete new piece. They also release the code, which lets others reproduce or adapt the pipeline directly. The framing around expert annotation cost in histology is straightforward and matches a real constraint in the field. The work stays empirical and does not claim new theory or derivations.

The main limitation is the experimental scale and reporting. With total pools this small, any active-learning curve is sensitive to which images land in the starting labeled set. The abstract gives no mention of repeated trials, different random seeds, error bars, or statistical comparison, so the 3-versus-15 gap could reflect a favorable split rather than consistent superiority of the uncertainty measure. No other selection strategies or stronger baselines appear to be tested either.

This is useful for readers already working on biomedical segmentation with scarce labels who want a quick starting framework and public code. They can treat the exact numbers as a case study rather than a settled result. The paper is coherent on its own terms and shows honest engagement with the practical problem, so it deserves peer review to let referees check the full protocol and ask for variance reporting or additional controls.

Referee Report

3 major / 2 minor

Summary. The paper presents a deep active learning framework that combines a U-Net architecture with Monte Carlo Dropout-based uncertainty sampling to select histology images for annotation in an axon-myelin segmentation task. On two small datasets of 14 and 24 images (spinal cord and brain samples acquired under different EM modalities), the method is reported to reach its maximum Dice score after the addition of only 3 uncertainty-selected samples to an initial training set, versus 15 randomly selected samples.

Significance. If the performance gap is shown to be robust, the work would offer a practical, low-cost route to high-accuracy myelin segmentation when expert pixel-level labels are scarce. The public release of the code is a clear strength that aids reproducibility and adoption in biomedical imaging.

major comments (3)

[Results / Experiments] Results section (and abstract): the central claim that uncertainty sampling reaches peak Dice after 3 samples versus 15 for random selection is presented without any report of multiple random initializations, standard deviations, or statistical tests. With total dataset sizes of only 14 and 24 images, performance curves are known to be sensitive to the composition of the initial labeled pool; absence of variance measures leaves the reported reduction in annotation effort unverified.
[Methods / Experiments] Experimental protocol: the manuscript gives no description of how the initial training set is chosen, the size of the unlabeled pool at each iteration, the precise stopping criterion for the active-learning loop, or the full hyper-parameter settings used for the U-Net and MC-Dropout sampling. These omissions make it impossible to reproduce or assess the reliability of the 3-versus-15 comparison.
[Results] Evaluation: only a single overall Dice value trajectory is shown; no per-class (axon vs. myelin) metrics, no comparison against other established active-learning acquisition functions (e.g., BALD, core-set), and no baseline using a non-Dropout uncertainty estimator are provided. This limits the ability to attribute the observed gain specifically to the MC-Dropout uncertainty measure.

minor comments (2)

[Abstract] The abstract describes the method as 'significantly reducing the annotation effort' without supplying the actual Dice curves, the number of MC samples, or any quantitative measure of significance.
[Figures] Figure captions and axis labels should explicitly state the number of MC forward passes and the exact uncertainty aggregation formula used.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects for strengthening the experimental validation and reproducibility of our active learning framework. We address each major comment point-by-point below.

Referee: [Results / Experiments] Results section (and abstract): the central claim that uncertainty sampling reaches peak Dice after 3 samples versus 15 for random selection is presented without any report of multiple random initializations, standard deviations, or statistical tests. With total dataset sizes of only 14 and 24 images, performance curves are known to be sensitive to the composition of the initial labeled pool; absence of variance measures leaves the reported reduction in annotation effort unverified.

Authors: We agree that variance reporting and statistical analysis are essential for small datasets. In the revised manuscript we will rerun all experiments across multiple random initializations (minimum 5 seeds), report mean Dice trajectories with standard deviations, and include paired statistical tests (e.g., Wilcoxon signed-rank) between uncertainty and random selection curves.

Revision: yes

Referee: [Methods / Experiments] Experimental protocol: the manuscript gives no description of how the initial training set is chosen, the size of the unlabeled pool at each iteration, the precise stopping criterion for the active-learning loop, or the full hyper-parameter settings used for the U-Net and MC-Dropout sampling. These omissions make it impossible to reproduce or assess the reliability of the 3-versus-15 comparison.

Authors: We acknowledge the protocol details were insufficiently specified. The revised Methods section will explicitly state: initial training set selection procedure and size, unlabeled pool composition at each step, stopping criterion (performance plateau or fixed budget), and complete hyper-parameter values for the U-Net (architecture, optimizer, learning rate, epochs) together with MC-Dropout settings (dropout probability, number of forward passes).

Revision: yes

Referee: [Results] Evaluation: only a single overall Dice value trajectory is shown; no per-class (axon vs. myelin) metrics, no comparison against other established active-learning acquisition functions (e.g., BALD, core-set), and no baseline using a non-Dropout uncertainty estimator are provided. This limits the ability to attribute the observed gain specifically to the MC-Dropout uncertainty measure.

Authors: We will add per-class (axon and myelin) Dice scores to the results. However, systematic comparisons against BALD, core-set, and non-Dropout estimators would require extensive new experiments that exceed the scope of the current work, which demonstrates a simple, reproducible MC-Dropout baseline. We will note this limitation explicitly and indicate that such comparisons are left for future investigation.

Revision: partial
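The variance reporting promised in the first response amounts to collecting one Dice curve per random seed and summarizing each acquisition step across seeds. A minimal sketch, assuming the curves are stored as a mapping from seed to a list of Dice values; the name `summarize_dice_curves` is hypothetical, and a paired test such as Wilcoxon signed-rank would be a separate step not shown here.

```python
from statistics import mean, stdev


def summarize_dice_curves(curves_by_seed):
    """Return (mean, sample std) of Dice at each acquisition step across seeds."""
    curves = list(curves_by_seed.values())
    if len(curves) < 2:
        raise ValueError("need at least two seeds to estimate a standard deviation")
    if len({len(curve) for curve in curves}) != 1:
        raise ValueError("all curves must cover the same number of steps")
    # zip(*curves) groups the Dice values of every seed for the same step.
    return [(mean(step), stdev(step)) for step in zip(*curves)]
```

Plotting the mean with a band of one standard deviation for both selection strategies would show whether the 3-versus-15 gap survives a change of initial labeled set.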

Circularity Check

0 steps flagged

No circularity: purely empirical active-learning experiments

The paper reports experimental results comparing uncertainty sampling (MC Dropout on U-Net) versus random selection on two small histology datasets (14 and 24 images). No derivation chain, first-principles equations, fitted parameters renamed as predictions, or self-citation load-bearing steps exist. The headline claim (max Dice after +3 uncertainty samples vs +15 random) is a direct empirical outcome, not reduced to inputs by construction. This matches the default non-finding for experimental papers without mathematical derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the work applies standard U-Net architecture and Monte Carlo Dropout uncertainty estimation from prior literature to a new domain.

