Instance-Aware Pseudo-Labeling and Class-Focused Contrastive Learning for Weakly Supervised Domain Adaptive Segmentation of Electron Microscopy
Pith reviewed 2026-05-18 06:22 UTC · model grok-4.3
The pith
Detection-guided pseudo-labels improve EM domain-adaptive segmentation
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce a multitask learning framework that jointly conducts segmentation and center detection with a novel cross-teaching mechanism and class-focused cross-domain contrastive learning. We introduce segmentation self-training with a novel instance-aware pseudo-label (IPL) selection strategy. Unlike existing methods that typically rely on pixel-wise pseudo-label filtering, the IPL semantically selects reliable and diverse pseudo-labels with the help of the detection task.
What carries the argument
The instance-aware pseudo-label (IPL) selection strategy, which uses the center detection task to semantically select reliable and diverse pseudo-labels from unlabeled target domain regions.
If this is right
- Outperforms existing UDA and WDA methods on challenging datasets.
- Significantly narrows the performance gap with the supervised upper bound.
- Achieves substantial improvements over other UDA techniques even without point labels.
- Effectively utilizes incomplete and imprecise point annotations via multitask learning.
Where Pith is reading between the lines
- The IPL approach could be adapted for other instance segmentation tasks in medical or scientific imaging with limited labels.
- Multitask learning combining detection and segmentation may help in general for improving pseudo-label quality in domain adaptation.
- This method suggests a way to reduce annotation costs in cross-domain biological image analysis.
Load-bearing premise
The instance-aware pseudo-label (IPL) selection strategy, guided by the center detection task, can reliably identify accurate and diverse pseudo-labels from unlabeled image regions in the target domain.
What would settle it
A direct comparison on target domain images where full annotations are available showing that IPL-selected pseudo-labels have high error rates or low diversity would falsify the reliability of the selection strategy.
Figures
read the original abstract
Annotation-efficient segmentation of the numerous mitochondria instances from various electron microscopy (EM) images is highly valuable for biological and neuroscience research. Although unsupervised domain adaptation (UDA) methods can help mitigate domain shifts and reduce the high costs of annotating each domain, they typically have relatively low performance in practical applications. Thus, we investigate weakly supervised domain adaptation (WDA) that utilizes additional sparse point labels on the target domain, which require minimal annotation effort and minimal expert knowledge. To take full use of the incomplete and imprecise point annotations, we introduce a multitask learning framework that jointly conducts segmentation and center detection with a novel cross-teaching mechanism and class-focused cross-domain contrastive learning. While leveraging unlabeled image regions is essential, we introduce segmentation self-training with a novel instance-aware pseudo-label (IPL) selection strategy. Unlike existing methods that typically rely on pixel-wise pseudo-label filtering, the IPL semantically selects reliable and diverse pseudo-labels with the help of the detection task. Comprehensive validations and comparisons on challenging datasets demonstrate that our method outperforms existing UDA and WDA methods, significantly narrowing the performance gap with the supervised upper bound. Furthermore, under the UDA setting, our method also achieves substantial improvements over other UDA techniques.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a multitask weakly supervised domain adaptation (WDA) framework for mitochondria instance segmentation in electron microscopy (EM) images. It jointly trains segmentation and center detection heads with a cross-teaching mechanism, adds class-focused cross-domain contrastive learning, and introduces an instance-aware pseudo-label (IPL) selection strategy that uses detection outputs to choose reliable and diverse pseudo-labels from unlabeled target-domain regions instead of pixel-wise filtering. The central claim is that this approach outperforms prior UDA and WDA methods on challenging datasets while substantially narrowing the gap to a fully supervised upper bound; the method is also shown to improve results under the pure UDA setting.
Significance. If the performance claims hold under rigorous controls, the work would be significant for annotation-efficient biomedical image analysis. Sparse point labels on the target domain are far cheaper than dense masks, and the IPL strategy that couples detection with segmentation offers a concrete way to exploit unlabeled regions without the usual pitfalls of noisy pseudo-labels. The multitask cross-teaching and contrastive components are technically coherent and could generalize to other instance segmentation tasks with domain shift.
major comments (2)
- [Method (IPL selection) and Experiments] The central performance claims rest on the IPL selection strategy (described in the method section and abstract). The paper provides no quantitative evaluation of center-detection accuracy on target-domain instances (e.g., center localization error, precision/recall of detected centers, or failure cases for small/dense mitochondria). Without such analysis it is impossible to verify that the auxiliary detection head remains reliable enough under domain shift to avoid systematically biased pseudo-label selection.
- [Abstract and Experimental Results] The abstract asserts that the method 'significantly narrow[s] the performance gap with the supervised upper bound' and outperforms existing UDA/WDA methods, yet the provided text contains no details on the exact datasets, train/test splits, evaluation metrics (Dice, AJI, etc.), number of runs, or statistical significance tests. These omissions make it impossible to assess whether the reported gains are robust or sensitive to post-hoc hyper-parameter choices.
minor comments (2)
- [Method] Notation for the cross-teaching loss and the class-focused contrastive term should be introduced with explicit equations and variable definitions in the method section to improve readability.
- [Figures] Figure captions and axis labels in the qualitative results should explicitly state which domain (source/target) and which method each panel corresponds to.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of the IPL strategy and experimental reporting that we have addressed in the revision.
read point-by-point responses
-
Referee: [Method (IPL selection) and Experiments] The central performance claims rest on the IPL selection strategy (described in the method section and abstract). The paper provides no quantitative evaluation of center-detection accuracy on target-domain instances (e.g., center localization error, precision/recall of detected centers, or failure cases for small/dense mitochondria). Without such analysis it is impossible to verify that the auxiliary detection head remains reliable enough under domain shift to avoid systematically biased pseudo-label selection.
Authors: We agree that quantitative analysis of the center-detection head on the target domain would provide direct evidence for the reliability of IPL selection. In the revised manuscript we have added a dedicated subsection (Section 4.4) reporting center localization error (in pixels), precision/recall of detected centers, and qualitative discussion of failure cases on small or densely packed mitochondria. These results, obtained under the same domain-shift conditions as the main experiments, show that the cross-teaching mechanism keeps detection accuracy sufficiently high to avoid systematic bias in pseudo-label selection. revision: yes
-
Referee: [Abstract and Experimental Results] The abstract asserts that the method 'significantly narrow[s] the performance gap with the supervised upper bound' and outperforms existing UDA/WDA methods, yet the provided text contains no details on the exact datasets, train/test splits, evaluation metrics (Dice, AJI, etc.), number of runs, or statistical significance tests. These omissions make it impossible to assess whether the reported gains are robust or sensitive to post-hoc hyper-parameter choices.
Authors: We acknowledge the omissions. The revised abstract now explicitly states the datasets (MitoEM and the additional EM volumes), train/test splits, metrics (Dice, AJI, and PQ), and that all results are averaged over three independent runs with standard deviation. In the experimental section we have added a new table summarizing mean and std across runs together with paired t-test p-values against the strongest baselines, confirming that the reported improvements are statistically significant and not sensitive to post-hoc choices. revision: yes
Circularity Check
Minor self-citation in related work but central multitask framework and IPL strategy are independently proposed and empirically validated
full rationale
The paper introduces a multitask segmentation-plus-center-detection framework with cross-teaching and a novel instance-aware pseudo-label selection strategy that uses detection outputs to filter reliable pseudo-labels. Performance claims rest on experimental comparisons against UDA/WDA baselines on EM datasets rather than any mathematical derivation that loops back to fitted parameters or self-defined quantities. No equations reduce predictions to inputs by construction, and the method is presented as a set of algorithmic choices whose effectiveness is tested externally. This is a standard empirical contribution with at most incidental self-citation that does not bear the load of the core claims.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Joint training of segmentation and center detection tasks via cross-teaching produces mutual performance gains.
- ad hoc to paper Instance-aware pseudo-label selection using detection outputs yields more reliable and diverse labels than pixel-wise filtering.
Reference graph
Works this paper leans on
-
[1]
This study develops an effective WDA method that sig- nificantly outperforms UDA methods and demonstrates comparable performance to supervised methods with minimal annotation efforts and knowledge
-
[2]
Rather than using pixel-wise pseudo-label selection, this study proposes a simple yet effective instance-aware JOURNALS TEMPLATE 3 Fig. 2. Overview of the proposed method for weakly-supervised cross-domain adaptation. Under a multitask learning framework, an auxiliary center detection task is utilized to achieve instance-aware pseudo-label selection for t...
-
[3]
A class-focused contrastive learning approach has been introduced to effectively learn domain-invariant features. II. RELATEDWORK A. Unsupervised Domain Adaptation Deep learning models trained on a specific domain often suffer from significant performance degradation when tested on datasets with shifted distributions. Although foundation models like the S...
-
[4]
The goal of the WDA is to learn a segmentation model that adapts well to the target domain
is the kernel bandwidth. The goal of the WDA is to learn a segmentation model that adapts well to the target domain. Model overview. An overview of our model is demon- strated in Fig. 2, in which we conduct multitask learning with a cross-task teaching mechanism. Our model takes an encoder- decoder architecturef D ◦f E with a segmentation headf S and a re...
-
[5]
in a resolution of 5×5×5nm 3, representing tissues from the mouse CA1 hippocampus region. This dataset was split into two subsets, each of which contains 165 image slices of size 768×1,024, for training and testing, respectively. MitoEM-R Cortex Data. The MitoEM Dataset [13] con- strains two subsets. The MitoEM-R subset was scanned using a multi-beam scan...
-
[6]
are 2.5D methods that take multiple slices as input, while others, including our model, simply conduct 2D segmentation. We also compare our method with typical foundation models, including SAM [27], Med-SA [28], and their interactive ver- sions, i.e., SAM (Interact), and Med-SA (Interact), which take center-points of all mitochondria instances as user pro...
-
[7]
auxiliary detection with pseudo-labeling, 2) instance-aware pseudo-labeling for segmentation, 3) class-focused contrastive learning. By comparing our full model with Model III in Table II, we have noticed a performance drop of 1.5% in the Dice coefficient when class-focused contrastive learning was removed. By further removing instance-aware pseudo-labeli...
-
[8]
Segmen- tation in large-scale cellular electron microscopy with deep learning: A literature survey,
A. Aswath, A. Alsahaf, B. N. Giepmans, and G. Azzopardi, “Segmen- tation in large-scale cellular electron microscopy with deep learning: A literature survey,”Medical Image Analysis, p. 102920, 2023
work page 2023
-
[9]
Learning for structured prediction using approximate subgradient descent with working sets,
A. Lucchi, Y . Li, and P. Fua, “Learning for structured prediction using approximate subgradient descent with working sets,” inProceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 1987–1994
work page 2013
-
[10]
Call to action to properly utilize electron microscopy to measure organelles to monitor disease,
K. Neikirk, E.-G. Lopez, A. G. Marshall, A. Alghanem, E. Krystofiak, B. Kula, N. Smith, J. Shao, P. Katti, and A. O. Hinton Jr, “Call to action to properly utilize electron microscopy to measure organelles to monitor disease,”European Journal of Cell Biology, p. 151365, 2023
work page 2023
-
[11]
Mitochondria in disease: changes in shapes and dynamics,
B. C. Jenkins, K. Neikirk, P. Katti, S. M. Claypool, A. Kirabo, M. R. McReynolds, and A. Hinton, “Mitochondria in disease: changes in shapes and dynamics,”Trends in Biochemical Sciences, 2024
work page 2024
-
[12]
J. Liu, J. Qi, X. Chen, Z. Li, B. Hong, H. Ma, G. Li, L. Shen, D. Liu, Y . Konget al., “Fear memory-associated synaptic and mitochondrial changes revealed by deep learning-based processing of electron mi- croscopy data,”Cell Reports, vol. 40, no. 5, 2022
work page 2022
-
[13]
U-net: Convolutional networks for biomedical image segmentation,
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” inInternational Conference on Medical Image Computing and Computer-Assisted Intervention, 2015, pp. 234–241
work page 2015
-
[14]
J. Peng and Z. Luo, “Cs-net: Instance-aware cellular segmentation with hierarchical dimension-decomposed convolutions and slice-attentive learning,”Knowledge-Based Systems, vol. 232, p. 107485, 2021
work page 2021
-
[15]
Adaptive template transformer for mitochondria segmentation in elec- tron microscopy images,
Y . Pan, N. Luo, R. Sun, M. Meng, T. Zhang, Z. Xiong, and Y . Zhang, “Adaptive template transformer for mitochondria segmentation in elec- tron microscopy images,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 21 474–21 484
work page 2023
-
[16]
S. K. Zhou, H. Greenspan, C. Davatzikos, J. S. Duncan, B. Van Gin- neken, A. Madabhushi, J. L. Prince, D. Rueckert, and R. M. Summers, “A review of deep learning in medical imaging: Imaging traits, technol- ogy trends, case studies with progress highlights, and future promises,” Proceedings of the IEEE, vol. 109, no. 5, pp. 820–838, 2021
work page 2021
-
[17]
Mask rearranging data augmentation for 3d mitochondria segmentation,
Q. Chen, M. Li, J. Li, B. Hu, and Z. Xiong, “Mask rearranging data augmentation for 3d mitochondria segmentation,” inInternational Conference on Medical Image Computing and Computer-Assisted Inter- vention. Springer, 2022, pp. 36–46
work page 2022
-
[18]
Evidential uncertainty-guided mitochondria segmentation for 3d em images,
R. Shi, L. Duan, T. Huang, and T. Jiang, “Evidential uncertainty-guided mitochondria segmentation for 3d em images,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 5, 2024, pp. 4847–4855
work page 2024
-
[19]
Weakly-supervised cross-domain segmentation of electron microscopy with sparse point annotation,
D. Qiu, S. Xiong, J. Yi, and J. Peng, “Weakly-supervised cross-domain segmentation of electron microscopy with sparse point annotation,” IEEE Transactions on Big Data, 2024
work page 2024
-
[20]
Mitoem dataset: Large-scale 3d mitochondria instance segmentation from em images,
D. Wei, Z. Lin, D. Franco-Barranco, N. Wendt, X. Liu, W. Yin, X. Huang, A. Gupta, W.-D. Jang, X. Wanget al., “Mitoem dataset: Large-scale 3d mitochondria instance segmentation from em images,” in International Conference on Medical Image Computing and Computer- Assisted Intervention. Springer, 2020, pp. 66–76
work page 2020
-
[21]
A theory of learning from different domains,
S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. W. Vaughan, “A theory of learning from different domains,”Machine Learning, vol. 79, pp. 151–175, 2010
work page 2010
-
[22]
Generalized out-of-distribution detection: A survey,
J. Yang, K. Zhou, Y . Li, and Z. Liu, “Generalized out-of-distribution detection: A survey,”International Journal of Computer Vision, vol. 132, no. 12, pp. 5635–5662, 2024
work page 2024
-
[23]
Mitochondrial heterogeneity and home- ostasis through the lens of a neuron,
G. Pekkurnaz and X. Wang, “Mitochondrial heterogeneity and home- ostasis through the lens of a neuron,”Nature Metabolism, vol. 4, no. 7, pp. 802–812, 2022
work page 2022
-
[24]
Domain adaptation for medical image analysis: a survey,
H. Guan and M. Liu, “Domain adaptation for medical image analysis: a survey,”IEEE Transactions on Biomedical Engineering, vol. 69, no. 3, pp. 1173–1185, 2021
work page 2021
-
[25]
Unsupervised mitochondria segmentation in em images via domain adaptive multi-task learning,
J. Peng, J. Yi, and Z. Yuan, “Unsupervised mitochondria segmentation in em images via domain adaptive multi-task learning,”IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 6, pp. 1199–1209, 2020
work page 2020
-
[26]
Self-supervised augmentation consistency for adapting semantic segmentation,
N. Araslanov and S. Roth, “Self-supervised augmentation consistency for adapting semantic segmentation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15 384–15 394
work page 2021
-
[27]
Uncertainty-aware label rectification for domain adaptive mitochondria segmentation,
S. Wu, C. Chen, Z. Xiong, X. Chen, and X. Sun, “Uncertainty-aware label rectification for domain adaptive mitochondria segmentation,” in24th International Conference on Medical Image Computing and Computer Assisted Intervention. Springer, 2021, pp. 191–200
work page 2021
-
[28]
Class-aware feature alignment for domain adaptative mitochondria segmentation,
D. Yin, W. Huang, Z. Xiong, and X. Chen, “Class-aware feature alignment for domain adaptative mitochondria segmentation,” inInterna- tional Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2023, pp. 238–248
work page 2023
-
[29]
Domain adaptive mitochondria segmentation via enforcing inter-section consistency,
W. Huang, X. Liu, Z. Cheng, Y . Zhang, and Z. Xiong, “Domain adaptive mitochondria segmentation via enforcing inter-section consistency,” in International Conference on Medical Image Computing and Computer- Assisted Intervention. Springer, 2022, pp. 89–98
work page 2022
-
[30]
Wda-net: Weakly-supervised domain adap- tive segmentation of electron microscopy,
D. Qiu, J. Yi, and J. Peng, “Wda-net: Weakly-supervised domain adap- tive segmentation of electron microscopy,” in2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2022, pp. 1132–1137
work page 2022
-
[31]
Weakly-supervised domain adaptive semantic segmentation with prototypical contrastive learning,
A. Das, Y . Xian, D. Dai, and B. Schiele, “Weakly-supervised domain adaptive semantic segmentation with prototypical contrastive learning,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 15 434–15 443
work page 2023
-
[32]
Bi-directional contrastive learning for domain adaptive semantic segmentation,
G. Lee, C. Eom, W. Lee, H. Park, and B. Ham, “Bi-directional contrastive learning for domain adaptive semantic segmentation,” in European Conference on Computer Vision. Springer, 2022, pp. 38– 55
work page 2022
-
[33]
Learning to adapt structured output space for semantic segmentation,
Y . Tsai, W. Hung, S. Schulter, K. Sohn, M. Yang, and M. Chandraker, “Learning to adapt structured output space for semantic segmentation,” inIEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7472–7481
work page 2018
-
[34]
A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y . Loet al., “Segment anything,” inProceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 4015–4026. JOURNALS TEMPLATE 11
work page 2023
-
[35]
Medical sam adapter: Adapting segment anything model for medical image segmentation,
J. Wu, Z. Wang, M. Hong, W. Ji, H. Fu, Y . Xu, M. Xu, and Y . Jin, “Medical sam adapter: Adapting segment anything model for medical image segmentation,”Medical image analysis, vol. 102, p. 103547, 2025
work page 2025
-
[36]
Learning transferable features with deep adaptation networks,
M. Long, Y . Cao, J. Wang, and M. Jordan, “Learning transferable features with deep adaptation networks,” inInternational Conference on Machine Learning. PMLR, 2015, pp. 97–105
work page 2015
-
[37]
Domain-adversarial training of neural networks,
Y . Ganin, E. Ustinova, H. Ajakan, P. Germain, and H. Larochelle, “Domain-adversarial training of neural networks,”Journal of Machine Learning Research, vol. 17, no. 1, pp. 2096–2030, 2016
work page 2096
-
[38]
T. Li, S. Roy, H. Zhou, H. Lu, and S. Lathuili `ere, “Contrast, stylize and adapt: Unsupervised contrastive learning framework for domain adaptive semantic segmentation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 4869–4879
work page 2023
-
[39]
Cycada: Cycle-consistent adversarial domain adapta- tion,
J. Hoffman, E. Tzeng, T. Park, J.-Y . Zhu, P. Isola, K. Saenko, A. Efros, and T. Darrell, “Cycada: Cycle-consistent adversarial domain adapta- tion,” inInternational Conference on Machine Learning. PMLR, 2018, pp. 1989–1998
work page 2018
-
[40]
Domain adaptation for semantic segmentation via class-balanced self-training,
Y . Zou, Z. Yu, B. Kumar, and J. Wang, “Domain adaptation for semantic segmentation via class-balanced self-training,” inEuropean Conference on Computer Vision, 2018, pp. 289–305
work page 2018
-
[41]
Domain adaptive semantic segmentation using weak labels,
S. Paul, Y .-H. Tsai, S. Schulter, A. K. Roy-Chowdhury, and M. Chan- draker, “Domain adaptive semantic segmentation using weak labels,” in 16th European Conference on Computer Vision. Springer, 2020, pp. 571–587
work page 2020
-
[42]
Dawn: Domain-adaptive weakly supervised nuclei segmentation via cross-task interactions,
Y . Zhang, Y . Wang, Z. Fang, H. Bian, L. Cai, Z. Wang, and Y . Zhang, “Dawn: Domain-adaptive weakly supervised nuclei segmentation via cross-task interactions,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 5, pp. 4753–4767, 2024
work page 2024
-
[43]
Domain adaptive box- supervised instance segmentation network for mitosis detection,
Y . Li, Y . Xue, L. Li, X. Zhang, and X. Qian, “Domain adaptive box- supervised instance segmentation network for mitosis detection,”IEEE Transactions on Medical Imaging, vol. 41, no. 9, pp. 2469–2485, 2022
work page 2022
-
[44]
Single-image crowd counting via multi-column convolutional neural network,
Y . Zhang, D. Zhou, S. Chen, S. Gao, and Y . Ma, “Single-image crowd counting via multi-column convolutional neural network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 589–597
work page 2016
-
[45]
Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks,
D.-H. Leeet al., “Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks,” inWorkshop on challenges in representation learning at International Conference on Machine Learning, vol. 3, no. 2, 2013, p. 896
work page 2013
-
[46]
A dataset and a technique for generalized nuclear segmen- tation for computational pathology,
N. Kumar, R. Verma, S. Sharma, S. Bhargava, A. Vahadane, and A. Sethi, “A dataset and a technique for generalized nuclear segmen- tation for computational pathology,”IEEE Transactions on Medical Imaging, vol. 36, no. 7, pp. 1550–1560, 2017
work page 2017
-
[47]
A. Kirillov, K. He, R. Girshick, C. Rother, and P. Doll ´ar, “Panoptic segmentation,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 9404–9413
work page 2019
-
[48]
Understanding the behaviour of contrastive loss,
F. Wang and H. Liu, “Understanding the behaviour of contrastive loss,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 2495–2504
work page 2021
-
[49]
The genetics and pathology of mitochondrial disease,
C. L. Alston, M. C. Rocha, N. Z. Lax, D. M. Turnbull, and R. W. Taylor, “The genetics and pathology of mitochondrial disease,”The Journal of pathology, vol. 241, no. 2, pp. 236–250, 2017
work page 2017
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.