Data Synthesis Improves 3D Myotube Instance Segmentation

David Exler; John Jbeily; Mario Vitacolonna; Markus Reischl; Martin Krüger; Nils Friederich; Ralf Mikut; Rüdiger Rudolf

arxiv: 2604.14720 · v1 · submitted 2026-04-16 · 💻 cs.CV

Pith reviewed 2026-05-10 11:39 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D instance segmentation · synthetic data generation · myotube analysis · biomedical image segmentation · domain adaptation · U-Net architecture · annotation scarcity

The pith

A 3D U-Net trained only on synthetic myotube volumes achieves better instance segmentation on real microscopy data than established zero-shot models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that realistic synthetic data can replace scarce real annotations for training 3D segmentation models in biomedical imaging. Myotubes are hard to segment because no large labeled datasets exist, and pretrained models do not transfer well. By modeling the shapes and imaging artifacts of real myotubes, the authors create training data that lets a compact network learn to separate individual fibers in 3D volumes. If this works, researchers can quantify muscle fiber properties without manual labeling, speeding up studies of muscle disease and drug effects.

Core claim

The authors demonstrate that a geometry-driven synthesis pipeline, which generates myotube structures using polynomial centerlines, varying radii, branching, and ellipsoidal caps, combined with noise, artifacts, and domain adaptation, produces data sufficient to train a 3D U-Net with self-supervised pretraining. This model reaches a mean IPQ of 0.22 on real test images and outperforms three zero-shot segmentation baselines.

What carries the argument

The geometry-driven synthesis pipeline that models individual myotubes with polynomial centerlines, locally varying radii, branching structures, and ellipsoidal end caps, then renders them with realistic noise and CycleGAN domain adaptation.
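
The geometric core described here can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `synth_tube_mask`, the polynomial coefficients, the sinusoidal radius profile, and the volume size are all hypothetical choices made for illustration. Branching, noise, optical artifacts, and CycleGAN adaptation are left out, and the union-of-balls rasterization produces rounded spherical ends, which only stands in for the ellipsoidal end caps.

```python
import numpy as np

def synth_tube_mask(shape=(24, 48, 96), y_coeffs=(-30.0, 30.0, 16.0),
                    z_coeffs=(0.0, 12.0), base_radius=4.0, wobble=1.0,
                    n_points=150):
    """Binary mask of one tube: union of balls along a polynomial centerline.

    All default values are illustrative assumptions, not values from the paper.
    """
    zz, yy, xx = np.indices(shape)
    mask = np.zeros(shape, dtype=bool)
    # Curve parameter; the margins keep the rounded ends inside the volume.
    t = np.linspace(0.15, 0.85, n_points)
    cx = t * (shape[2] - 1)          # fibre runs along the x axis
    cy = np.polyval(y_coeffs, t)     # quadratic centerline in y
    cz = np.polyval(z_coeffs, t)     # linear (here constant) centerline in z
    # Locally varying radius along the fibre.
    radius = base_radius + wobble * np.sin(2 * np.pi * t)
    for x0, y0, z0, r in zip(cx, cy, cz, radius):
        mask |= (xx - x0) ** 2 + (yy - y0) ** 2 + (zz - z0) ** 2 <= r ** 2
    return mask

mask = synth_tube_mask()
print("foreground voxels:", int(mask.sum()))
```

Because the mask is generated from known geometry, the instance label comes for free, which is the property that lets synthetic volumes replace manual annotation.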

If this is right

Researchers can perform 3D instance segmentation of myotubes without collecting and annotating real training data.
Quantitative morphological measurements like diameter and branching become feasible in annotation-scarce domains.
Self-supervised pretraining on synthetic data improves generalization for compact 3D networks.
Biophysics-inspired synthesis can serve as a template for other biological structures with similar data limitations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

This approach could extend to segmenting other multinucleated or branching cells in 3D microscopy where annotations are expensive.
Future work might test if adding more biophysical parameters like contraction dynamics further improves performance.
The success suggests that domain-specific synthesis may reduce reliance on large public datasets in specialized biomedical tasks.
One could validate by comparing segmentation accuracy on myotubes grown under different conditions not represented in the synthesis.

Load-bearing premise

The synthetic volumes generated from polynomial centerlines and imaging artifacts match the statistical distribution of real myotube microscopy images closely enough for the model to generalize.

What would settle it

A clear falsifier would be if the trained model achieves substantially lower IPQ scores on a new real dataset collected with different microscope settings or myotube culture conditions not modeled in the synthesis pipeline.

Figures

Figures reproduced from arXiv: 2604.14720 by David Exler, John Jbeily, Mario Vitacolonna, Markus Reischl, Martin Krüger, Nils Friederich, Ralf Mikut, Rüdiger Rudolf.

**Figure 2.** Examples of 2D slices. a): Real, b): Corresponding sparse manual annotation, c)-e): Segmentation

**Figure 4.** Mean IPQ scores with standard error of the

Original abstract

Myotubes are multinucleated muscle fibers serving as key model systems for studying muscle physiology, disease mechanisms, and drug responses. Mechanistic studies and drug screening thereby rely on quantitative morphological readouts such as diameter, length, and branching degree, which in turn require precise three-dimensional instance segmentation. Yet established pretrained biomedical segmentation models fail to generalize to this domain due to the absence of large annotated myotube datasets. We introduce a geometry-driven synthesis pipeline that models individual myotubes via polynomial centerlines, locally varying radii, branching structures, and ellipsoidal end caps derived from real microscopy observations. Synthetic volumes are rendered with realistic noise, optical artifacts, and CycleGAN-based Domain Adaptation (DA). A compact 3D U-Net with self-supervised encoder pretraining, trained exclusively on synthetic data, achieves a mean IPQ of 0.22 on real data, significantly outperforming three established zero-shot segmentation models, demonstrating that biophysics-driven synthesis enables effective instance segmentation in annotation-scarce biomedical domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Synthetic geometry plus CycleGAN lets a small U-Net beat zero-shot models on real myotube images, but the paper needs to show the synthetics actually match real variability.

The paper shows that a geometry-based synthesis method for myotubes, followed by CycleGAN adaptation, allows a small 3D U-Net trained only on synthetic data to achieve better instance segmentation on real images than off-the-shelf zero-shot models. What is new is the combination of polynomial centerlines, varying radii, explicit branching, and ellipsoidal caps to generate realistic 3D structures, then rendering with noise and domain adaptation. This targets the specific morphology of myotubes drawn from real data observations. The self-supervised pretraining helps the compact network learn useful representations without labels. The reported mean IPQ of 0.22 on real data is a concrete outcome that demonstrates the approach can work for annotation-scarce problems in this area.

The main limitation is that we only see the end-to-end performance. There are no reported comparisons of geometric or appearance statistics between synthetic and real volumes, such as diameter distributions or curvature measures. This leaves open whether the model generalizes because the synthesis matches the real distribution or for other reasons. The abstract also omits the test set size and any statistical details, which makes it harder to judge how reliable the improvement is.

Readers working on 3D segmentation for biomedical applications, particularly those studying muscle fibers or similar elongated structures, will find this useful as a starting point for generating training data. It deserves serious peer review because it tackles a real practical problem with a describable method and shows a measurable gain, even if additional experiments on data fidelity would make the claims stronger. I recommend sending it out for review and asking the authors to add those distributional validation steps.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces a geometry-driven synthesis pipeline for generating 3D myotube volumes based on polynomial centerlines, locally varying radii, branching structures, and ellipsoidal end caps, with added noise, optical artifacts, and CycleGAN domain adaptation. A compact 3D U-Net with self-supervised encoder pretraining is trained exclusively on the resulting synthetic data and evaluated on real held-out microscopy volumes, where it reports a mean IPQ of 0.22 and outperforms three established zero-shot segmentation models.

Significance. If the distributional fidelity of the synthesis holds, the result is significant for annotation-scarce biomedical imaging domains: it shows that biophysics-informed procedural generation plus domain adaptation can substitute for real labeled data in training instance segmentation models for tubular structures. The combination of self-supervised pretraining with a compact architecture is a constructive methodological choice that improves data efficiency.

major comments (3)

[Abstract] Abstract: the central performance claim (mean IPQ of 0.22 with significant outperformance of zero-shot baselines) is presented without any accompanying information on test-set size, number of real volumes, standard deviation across instances, or statistical testing. This information is required to assess whether the reported gain is robust or could be driven by a small or atypical test set.
[Methods and Results] Methods (synthesis pipeline) and Results: no quantitative distributional checks are supplied to verify that the synthetic ensemble matches real myotube statistics. Quantities such as Kolmogorov-Smirnov or Wasserstein distances on diameter histograms, curvature distributions, branch angles, or intensity profiles between synthetic and real volumes are absent, leaving the generalization claim dependent on the downstream IPQ number alone.
[Methods] Methods (CycleGAN and noise model): the free parameters of the synthesis (polynomial degree, radius variation function, branching probability, cap eccentricity, CycleGAN hyperparameters) are listed but not ablated or subjected to sensitivity analysis. Without such controls it is impossible to determine which components are load-bearing for the observed IPQ improvement.
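
The distributional check requested in the second major comment can be sketched as follows. This is an illustrative implementation under stated assumptions, not something taken from the paper: the function name `ks_and_wasserstein` is hypothetical, the diameters are placeholder numbers rather than measurements, and only the two-sample Kolmogorov-Smirnov statistic and the 1D Wasserstein-1 distance are computed, both from the empirical CDFs. SciPy's `scipy.stats.ks_2samp` and `scipy.stats.wasserstein_distance` offer the same quantities, the former with a p-value.

```python
import numpy as np

def ks_and_wasserstein(real, synthetic):
    """Two-sample KS statistic and 1D Wasserstein-1 distance from empirical CDFs."""
    a = np.sort(np.asarray(real, dtype=float))
    b = np.sort(np.asarray(synthetic, dtype=float))
    grid = np.sort(np.concatenate([a, b]))
    # Empirical CDFs of both samples, evaluated at every pooled sample point.
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    gap = np.abs(cdf_a - cdf_b)
    ks = float(gap.max())                          # largest vertical CDF gap
    w1 = float(np.sum(gap[:-1] * np.diff(grid)))   # area between the two CDFs
    return ks, w1

# Placeholder diameters in arbitrary units, NOT measurements from the paper.
real_diam = [18.0, 20.0, 22.0, 24.0]
synth_diam = [19.0, 21.0, 23.0, 25.0]
ks, w1 = ks_and_wasserstein(real_diam, synth_diam)
print(f"KS statistic = {ks:.2f}, Wasserstein-1 = {w1:.2f}")
```

The same function could be applied to curvature, branch-angle, or intensity samples, provided each is reduced to a one-dimensional list of values per dataset.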

minor comments (2)

[Abstract] The acronym IPQ is used in the abstract without expansion; it should be defined at first use.
[Figures] Figure captions would benefit from explicit statements of what geometric or appearance features are being illustrated in each panel to aid readers in assessing synthesis realism.
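
Since IPQ is not expanded anywhere in the text above, its exact definition cannot be reproduced here. Assuming, without confirmation from the source, that it is a panoptic-quality-style instance metric, the standard panoptic quality can serve as a reference point: PQ is the sum of IoU over matched instance pairs divided by TP + 0.5 FP + 0.5 FN, with a pair counted as matched when its IoU exceeds 0.5. The sketch below implements that standard PQ; the function name and the toy label arrays are hypothetical.

```python
import numpy as np

def panoptic_quality(gt, pred, iou_thresh=0.5):
    """Standard PQ for two integer label arrays; label 0 is background.

    Assumption: this is generic PQ, not necessarily the paper's IPQ.
    """
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    iou_sum, tp = 0.0, 0
    for i in gt_ids:
        g = gt == i
        # Only predictions that overlap this ground-truth instance can match it.
        for j in np.unique(pred[g]):
            if j == 0:
                continue
            p = pred == j
            iou = np.logical_and(g, p).sum() / np.logical_or(g, p).sum()
            if iou > iou_thresh:  # with a threshold of at least 0.5 a match is unique
                iou_sum += iou
                tp += 1
                break
    fp = len(pred_ids) - tp   # predicted instances without a match
    fn = len(gt_ids) - tp     # ground-truth instances without a match
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom > 0 else 0.0

# Toy labels: one good match (IoU 0.75) and one miss (IoU exactly 0.5).
gt = np.array([1, 1, 1, 1, 0, 0, 2, 2, 2, 2])
pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 3, 3])
print(panoptic_quality(gt, pred))
```

The function works unchanged on 3D label volumes, because it relies only on element-wise comparisons.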

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major point below and describe the revisions we will implement to improve transparency and rigor.

Referee: [Abstract] Abstract: the central performance claim (mean IPQ of 0.22 with significant outperformance of zero-shot baselines) is presented without any accompanying information on test-set size, number of real volumes, standard deviation across instances, or statistical testing. This information is required to assess whether the reported gain is robust or could be driven by a small or atypical test set.

Authors: We agree that the abstract would benefit from these details to allow immediate assessment of robustness. The evaluation used a held-out set of real volumes, with per-volume IPQ scores and statistical comparisons already reported in the Results section. In the revised manuscript we will update the abstract to explicitly state the test-set size, include the standard deviation of the IPQ scores, and note the statistical test employed for the outperformance claim.

revision: yes

Referee: [Methods and Results] Methods (synthesis pipeline) and Results: no quantitative distributional checks are supplied to verify that the synthetic ensemble matches real myotube statistics. Quantities such as Kolmogorov-Smirnov or Wasserstein distances on diameter histograms, curvature distributions, branch angles, or intensity profiles between synthetic and real volumes are absent, leaving the generalization claim dependent on the downstream IPQ number alone.

Authors: We acknowledge that explicit quantitative distributional comparisons would strengthen the evidence for synthetic-data fidelity. While the current work relies on visual inspection and downstream task performance, we will add these checks in the revision. Specifically, we will compute and report Kolmogorov-Smirnov and Wasserstein distances for diameter histograms, curvature distributions, branch angles, and intensity profiles between the synthetic and real volumes.

revision: yes

Referee: [Methods] Methods (CycleGAN and noise model): the free parameters of the synthesis (polynomial degree, radius variation function, branching probability, cap eccentricity, CycleGAN hyperparameters) are listed but not ablated or subjected to sensitivity analysis. Without such controls it is impossible to determine which components are load-bearing for the observed IPQ improvement.

Authors: The synthesis parameters were selected from direct measurements on real microscopy data and established biophysical priors. To address the request for controls, we will include a sensitivity analysis in the revised Methods and Results. This analysis will vary the principal free parameters (polynomial degree, branching probability, and CycleGAN hyperparameters) and report the resulting IPQ scores on the real test set.

revision: yes
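
The reporting promised in the first response (test-set size, standard deviation, and a named statistical test) might look like the sketch below. Everything in it is an assumption: the per-volume scores are placeholders, not results from the paper, and the exact paired sign-flip permutation test is one reasonable choice for small test sets, since the text above does not say which test the authors used.

```python
import itertools
import math
import statistics

def summarize(scores):
    """Sample size, mean, sample standard deviation, and standard error."""
    n = len(scores)
    sd = statistics.stdev(scores)
    return {"n": n, "mean": statistics.fmean(scores),
            "sd": sd, "sem": sd / math.sqrt(n)}

def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign patterns, so it is only practical for small n.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / total

# Placeholder per-volume scores, NOT results from the paper.
ours = [0.30, 0.20, 0.25, 0.15, 0.25]
baseline = [0.10, 0.05, 0.12, 0.08, 0.02]
print(summarize(ours))
print("p =", paired_sign_flip_pvalue(ours, baseline))
```

With only five paired volumes the smallest attainable two-sided p-value is 2/32, which illustrates why the referee asks for the test-set size alongside any significance claim.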

Circularity Check

0 steps flagged

No circularity: performance claim independent of synthesis inputs

The paper's central result is that a 3D U-Net trained only on synthetic volumes (generated via polynomial centerlines, varying radii, branching, ellipsoidal caps, noise, artifacts, and CycleGAN adaptation derived from real observations) achieves IPQ=0.22 on held-out real test images and outperforms zero-shot baselines. This evaluation metric is computed directly on real data separate from the synthesis process and does not reduce by construction to any fitted parameter, self-citation, or input quantity; no equations equate the reported performance to the synthesis parameters themselves. No uniqueness theorems, ansatzes smuggled via prior self-work, or renamings of known results are invoked. The derivation chain is therefore self-contained and externally falsifiable via the real-data test set.

Axiom & Free-Parameter Ledger

2 free parameters · 0 axioms · 0 invented entities

The central claim rests on the domain assumption that a small set of geometric primitives plus style transfer can approximate real myotube image statistics closely enough for generalization. No explicit free parameters are quantified in the abstract, but the synthesis model implicitly depends on choices for centerline polynomials, radius functions, branching rules, and CycleGAN training hyperparameters.

free parameters (2)

synthesis geometry parameters (polynomial degree, radius variation function, branching probability, cap eccentricity)
Chosen to match observations from real microscopy; specific values act as free parameters that control how closely synthetic volumes resemble real data.
CycleGAN and noise model hyperparameters
Control the realism of domain adaptation and imaging artifacts; their selection influences whether the trained network generalizes.
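
One way to make this ledger concrete, and to set up the sensitivity analysis discussed in the referee report, is to collect the geometry parameters in a single configuration object and enumerate variations of it. The sketch below is hypothetical throughout: the class name, field names, and default values are illustrative placeholders, and the radius variation function is reduced to a single amplitude for simplicity. Scoring each configuration on real data is left to the reader's own training and evaluation code.

```python
from dataclasses import dataclass, replace
from itertools import product

@dataclass(frozen=True)
class SynthesisParams:
    # All defaults are hypothetical placeholders, not values from the paper.
    polynomial_degree: int = 3
    radius_variation: float = 0.2      # simplification: amplitude, not a function
    branching_probability: float = 0.1
    cap_eccentricity: float = 0.5

def sweep(base, grid):
    """Yield one parameter set per combination of the values in `grid`."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield replace(base, **dict(zip(names, values)))

# Vary two parameters while holding the others at their base values.
grid = {"polynomial_degree": [2, 3, 4], "branching_probability": [0.0, 0.1]}
configs = list(sweep(SynthesisParams(), grid))
for cfg in configs:
    print(cfg)
```

Keeping the parameters in one frozen object makes each synthetic dataset reproducible from its configuration alone, which is what an ablation table would need to reference.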
