arXiv: 2604.08711 · v1 · submitted 2026-04-09 · 💻 cs.CV · cs.AI · cs.LG

Deep Learning-Based Tracking and Lineage Reconstruction of Ligament Breakup

Pith reviewed 2026-05-10 17:51 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG
keywords deep learning · object detection · multi-object tracking · ligament fragmentation · lineage reconstruction · atomization · high-speed imaging · computer vision

The pith

A two-stage deep learning framework detects ligaments and droplets then classifies one-to-many fragmentation associations to reconstruct breakup lineages from high-speed images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a detection stage using Faster R-CNN with ResNet-50 and Feature Pyramid Network to locate and classify ligaments and droplets in shadowgraphy videos of impinging Carbopol gel jets. A second stage employs a Transformer-augmented multilayer perceptron that takes physics-informed geometric features to label each inter-frame link as continuation, fragmentation, or non-association, overcoming the one-to-one restriction of standard trackers. This setup produces fragmentation trees that preserve parent-child relationships and yields quantitative breakup statistics such as fragment multiplicity and size distributions. A sympathetic reader cares because primary atomization governs spray performance in combustion, coating, and agricultural applications, yet manual frame-by-frame analysis of transient events is prohibitively slow and error-prone.

Core claim

The two-stage pipeline first detects ligaments and droplets with a Faster R-CNN backbone achieving up to 0.872 F1 on held-out synthetic-augmented data, then classifies temporal associations with 86.1 percent accuracy and perfect recall on fragmentation events, thereby enabling automatic reconstruction of fragmentation trees and extraction of parent-child lineages during liquid-sheet disintegration.
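The relationship between accuracy, precision, and recall in this claim is easier to see with a small worked example. The sketch below is illustrative only: the label names and toy predictions are assumptions, not data or code from the paper. It shows how a classifier can reach perfect recall on one class while its precision and overall accuracy stay below 1.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class, treated as the positive label."""
    pairs = Counter(zip(y_true, y_pred))
    tp = pairs[(positive, positive)]
    fp = sum(n for (t, p), n in pairs.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of links whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (hypothetical, not from the paper): one continuation is
# mislabeled as fragmentation and one non-association as continuation.
y_true = ["cont", "cont", "frag", "none", "none", "none"]
y_pred = ["cont", "frag", "frag", "none", "none", "cont"]

frag_precision, frag_recall, frag_f1 = per_class_metrics(y_true, y_pred, "frag")
overall_accuracy = accuracy(y_true, y_pred)
```

In the toy data every true fragmentation link is recovered, so recall is 1, yet the mislabeled continuation lowers fragmentation precision and the two errors lower overall accuracy.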

What carries the argument

The Transformer-augmented multilayer perceptron that classifies inter-frame associations as continuation, fragmentation (one-to-many), or non-association using physics-informed geometric features extracted from detected objects.
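As a rough illustration of what such inputs might look like, the sketch below computes a few pairwise features from two bounding boxes in consecutive frames. The paper's actual feature set is not listed on this page, so the `Box` type, the `pair_features` function, and the specific features chosen (centroid shift, apparent speed, area ratio, overlap) are all hypothetical stand-ins.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self):
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    @property
    def centroid(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def iou(a, b):
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = a.area + b.area - inter
    return inter / union if union else 0.0


def pair_features(parent, child, dt=1.0):
    """Assumed features for one candidate parent-to-child link."""
    (px, py), (cx, cy) = parent.centroid, child.centroid
    dx, dy = cx - px, cy - py
    return {
        "dx": dx,                                   # centroid shift in x
        "dy": dy,                                   # centroid shift in y
        "speed": math.hypot(dx, dy) / dt,           # apparent speed per frame gap
        "area_ratio": child.area / parent.area if parent.area else 0.0,
        "iou": iou(parent, child),                  # spatial overlap
    }


# Toy example: a small droplet box lying inside a larger ligament box.
ligament = Box(0, 0, 10, 4)
droplet = Box(2, 1, 6, 3)
features = pair_features(ligament, droplet)
```

A feature vector like this would be computed for every candidate pair and passed to the classifier; a small area ratio combined with a short centroid shift is the kind of cue that might separate fragmentation from continuation.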

If this is right

  • Fragmentation trees can be built automatically from image sequences, replacing manual lineage tracing.
  • Parent-child relationships are preserved explicitly, allowing direct counting of child droplets per parent ligament.
  • Breakup statistics such as fragment multiplicity and droplet size distributions become computable without human intervention.
  • Primary atomization mode can be analyzed by tallying children droplets formed through ligament fragmentation.
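The consequences listed above can be sketched with a small data structure. Assuming the association stage emits `(parent_id, child_id, label)` triples, the code below builds a parent-to-children map, counts fragment multiplicity, and traces a droplet back to its earliest ancestor. The identifiers, label strings, and function names are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def build_tree(links):
    """Return (children, parent_of) maps, skipping non-associations."""
    children = defaultdict(list)
    parent_of = {}
    for parent, child, label in links:
        if label == "non-association":
            continue
        children[parent].append((child, label))
        parent_of[child] = parent
    return children, parent_of


def fragment_multiplicity(children):
    """Number of fragmentation children for each parent that fragments."""
    counts = {}
    for parent, kids in children.items():
        n = sum(1 for _, label in kids if label == "fragmentation")
        if n:
            counts[parent] = n
    return counts


def lineage(node, parent_of):
    """Ancestors of `node`, oldest first, ending with `node` itself."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path[::-1]


# Toy links: a ligament persists for one frame, then sheds three droplets.
links = [
    ("L1@t0", "L1@t1", "continuation"),
    ("L1@t1", "D1@t2", "fragmentation"),
    ("L1@t1", "D2@t2", "fragmentation"),
    ("L1@t1", "D3@t2", "fragmentation"),
    ("L1@t1", "D9@t2", "non-association"),
]
children, parent_of = build_tree(links)
multiplicity = fragment_multiplicity(children)
ancestry = lineage("D2@t2", parent_of)
```

Because each child keeps a single parent pointer while a parent may hold many children, the structure represents one-to-many splits directly; merging events would need a different representation.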

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same association classifier could be retrained on other multi-scale fluid-breakup datasets once new synthetic data are generated with the same morphology-preserving rules.
  • Real-time spray monitoring in industrial settings becomes feasible if the detection and association steps are optimized for lower-latency hardware.
  • Coupling the lineage output with existing CFD simulations of atomization could close the loop between experiment and model validation on a per-event basis.

Load-bearing premise

The physics-informed geometric features together with morphology-preserving synthetic data are enough for the association classifier to generalize beyond the specific Carbopol gel experiments and the fourteen synthetic configurations used in training.

What would settle it

Apply the trained model to high-speed videos of a different fluid or jet geometry and measure whether fragmentation-event recall falls below 0.9 or whether the reconstructed trees produce size distributions inconsistent with independent manual counts on the same sequences.
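One way to make this test concrete is sketched below. It computes fragmentation-event recall against manual labels and compares two droplet-diameter samples with a two-sample Kolmogorov-Smirnov statistic. The 0.9 threshold comes from the text above; using the KS statistic as the measure of "inconsistent" size distributions is an assumption of this sketch, as are all names and toy values.

```python
from bisect import bisect_right

RECALL_THRESHOLD = 0.9  # threshold named in the text above


def fragmentation_recall(true_links, predicted_links):
    """Fraction of manually labeled fragmentation links that were predicted."""
    true_links = set(true_links)
    if not true_links:
        raise ValueError("no ground-truth fragmentation links")
    return len(true_links & set(predicted_links)) / len(true_links)


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Toy values (hypothetical): one of three true fragmentation links is missed.
true_links = {("p1", "c1"), ("p1", "c2"), ("p2", "c3")}
predicted_links = {("p1", "c1"), ("p1", "c2"), ("p9", "c9")}
recall = fragmentation_recall(true_links, predicted_links)
fails_recall_check = recall < RECALL_THRESHOLD

manual_diameters = [1.0, 2.0, 3.0]
model_diameters = [1.0, 2.0, 3.0]
distribution_gap = ks_statistic(manual_diameters, model_diameters)
```

A gap of 0 means the two empirical distributions coincide and a gap of 1 means they do not overlap at all; where to draw the line for "inconsistent" would have to be chosen by the experimenter.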

Figures

Figures reproduced from arXiv: 2604.08711 by Lipika Kabiraj, Mudasir Ganaie, Vivek Kurumanghat, Vrushank Ahire.

Figure 1: Experimental setup for impinging jets atomization and high-speed shadowgraphy. A pressurized piston-cylinder system drives the liquid through two …
Figure 2: Representative examples of morphology-preserving data generation
Figure 3: Proposed two-stage deep learning framework.
Figure 4: Droplet diameter histograms for Experiment 1 comparing CV-derived and model-predicted distributions across five training configurations. Mixed …
Figure 5: Scatter plots for Experiment 1 comparing predicted and ground-truth geometric features across five training configurations. Mixed original–synthetic …
Figure 6: Droplet diameter histograms for Experiment 2. Increasing synthetic data improves agreement between CV and model-predicted droplet-size distributions
Figure 7: Scatter plots for Experiment 2 illustrating the relationship between predicted and ground-truth geometric features as the synthetic data volume increases.
Figure 8: Droplet diameter histograms for Experiment 3 showing strong agreement between CV-derived and model-predicted droplet-size distributions across all …
Figure 9: Scatter plots for Experiment 3 illustrating the effect of increasing original data volume while keeping the synthetic set fixed. Agreement with ground …
Figure 10: Comparison of object detection and classification results for models trained with different data compositions. Ligaments and droplets are denoted by …
Figure 11: Visualization of ligament and droplet tracking at a downstream location following initial sheet breakup. (A) Ground-truth correlations between two …
Figure 12: Correlation matrix for the five geometric input features together …
Original abstract

The disintegration of liquid sheets into ligaments and droplets involves highly transient, multi-scale dynamics that are difficult to quantify from high-speed shadowgraphy images. Identifying droplets, ligaments, and blobs formed during breakup, along with tracking across frames, is essential for spray analysis. However, conventional multi-object tracking frameworks impose strict one-to-one temporal associations and cannot represent one-to-many fragmentation events. In this study, we present a two-stage deep learning framework for object detection and temporal relationship modeling across frames. The framework captures ligament deformation, fragmentation, and parent-child lineage during liquid sheet disintegration. In the first stage, a Faster R-CNN with a ResNet-50 backbone and Feature Pyramid Network detects and classifies ligaments and droplets in high-speed shadowgraphy recordings of an impinging Carbopol gel jet. A morphology-preserving synthetic data generation strategy augments the training set without introducing physically implausible configurations, achieving a held-out F1 score of up to 0.872 across fourteen original-to-synthetic configurations. In the second stage, a Transformer-augmented multilayer perceptron classifies inter-frame associations into continuation, fragmentation (one-to-many), and non-association using physics-informed geometric features. Despite severe class imbalance, the model achieves 86.1% accuracy, 93.2% precision, and perfect recall (1.00) for fragmentation events. Together, the framework enables automated reconstruction of fragmentation trees, preservation of parent-child lineage, and extraction of breakup statistics such as fragment multiplicity and droplet size distributions. By explicitly identifying children droplets formed from ligament fragmentation, the framework provides automated analysis of the primary atomization mode.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes a two-stage deep learning pipeline for object detection and temporal association in high-speed images of ligament breakup during liquid sheet disintegration. Stage one employs a Faster R-CNN with ResNet-50 backbone and Feature Pyramid Network, trained with morphology-preserving synthetic data, to detect and classify ligaments and droplets, reporting a held-out F1 score of 0.872. Stage two uses a Transformer-augmented multilayer perceptron to model inter-frame associations (continuation, fragmentation, non-association) based on physics-informed geometric features, achieving 86.1% accuracy and perfect recall on fragmentation despite class imbalance. The framework is claimed to enable automated reconstruction of fragmentation trees and extraction of breakup statistics for primary atomization analysis.

Significance. If the results hold under broader validation, this work offers a practical solution to a longstanding limitation in multi-object tracking for fluid experiments by explicitly modeling one-to-many fragmentation events rather than enforcing one-to-one associations. The morphology-preserving synthetic data strategy and physics-informed features represent a targeted strength for data-scarce scientific imaging domains, potentially enabling quantitative extraction of fragment multiplicity and droplet size distributions from shadowgraphy sequences in atomization research.

major comments (3)
  1. [Abstract and Results] Abstract and Results section: The reported metrics (F1 up to 0.872 for detection; 86.1% accuracy, 1.00 fragmentation recall for association) are obtained exclusively on held-out splits drawn from the same fourteen synthetic configurations and the specific Carbopol impinging-jet experiments. No cross-fluid, cross-velocity, or multi-ligament scene tests are described, which directly undermines the central claim that the framework 'provides automated analysis of the primary atomization mode' because the chosen geometric features may be tuned to this narrow distribution rather than capturing universal breakup kinematics.
  2. [Methods and Evaluation] Methods (association classifier) and Evaluation: Insufficient detail is given on held-out test construction for the association task, including how one-to-many fragmentation events were labeled, how severe class imbalance was handled in the test split, and whether temporal sequences were strictly separated to avoid leakage. No error bars, confidence intervals, or statistics from multiple training runs are reported for the 86.1% accuracy figure, making the perfect recall claim difficult to interpret as robust.
  3. [Evaluation] Evaluation: The assumption that the physics-informed geometric features plus the morphology-preserving synthetic data suffice for reliable lineage reconstruction is load-bearing for the paper's contribution, yet it is only verified within the fourteen synthetic configurations; this leaves open the risk that the Transformer-augmented MLP will fail to generalize to unseen breakup regimes.
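The concern in major comment 2 about interpreting perfect recall without confidence intervals can be illustrated with a Wilson score interval. This is a generic statistical sketch, not the authors' procedure, and the event counts used are hypothetical because the page does not report how many fragmentation events the test split contains.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: every fragmentation event recovered, for two test sizes.
low_20, high_20 = wilson_interval(20, 20)
low_5, high_5 = wilson_interval(5, 5)
```

With 20 of 20 events recovered the lower bound is roughly 0.84, and with 5 of 5 it drops much further, so an observed recall of 1.00 is only as convincing as the number of events behind it.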
minor comments (3)
  1. [Abstract] Abstract: The qualifier 'up to 0.872' for the F1 score is imprecise; report the exact value per configuration or the mean and standard deviation across the fourteen setups.
  2. [Methods] Methods: Provide the precise train/test split ratios and total number of real versus synthetic images used for both stages to support reproducibility.
  3. [Discussion] Discussion: A dedicated limitations paragraph addressing potential failure cases (e.g., different fluids, lighting conditions, or overlapping ligaments) would strengthen the manuscript.
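The reporting asked for in minor comment 1 amounts to a short summary over per-configuration scores. The sketch below shows the calculation with Python's `statistics` module; the three scores are placeholders chosen for easy arithmetic, not values from the paper, which reports only the maximum on this page.

```python
from statistics import mean, stdev


def summarize(scores):
    """Mean, sample standard deviation and maximum of per-configuration scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores for a standard deviation")
    return {
        "n": len(scores),
        "mean": mean(scores),
        "std": stdev(scores),   # sample standard deviation (n - 1 denominator)
        "max": max(scores),     # what an "up to" figure reports
    }


# Placeholder F1 scores (hypothetical, NOT the paper's results).
placeholder_f1 = [0.80, 0.85, 0.90]
summary = summarize(placeholder_f1)
```

Reporting the mean and spread next to the maximum lets a reader judge whether the best configuration is typical or an outlier.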

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough review and constructive comments. We have carefully considered each point and revised the manuscript to improve clarity on the scope of validation, provide additional methodological details, and discuss limitations regarding generalization.

Point-by-point responses
  1. Referee: [Abstract and Results] Abstract and Results section: The reported metrics (F1 up to 0.872 for detection; 86.1% accuracy, 1.00 fragmentation recall for association) are obtained exclusively on held-out splits drawn from the same fourteen synthetic configurations and the specific Carbopol impinging-jet experiments. No cross-fluid, cross-velocity, or multi-ligament scene tests are described, which directly undermines the central claim that the framework 'provides automated analysis of the primary atomization mode' because the chosen geometric features may be tuned to this narrow distribution rather than capturing universal breakup kinematics.

    Authors: We agree that the validation is limited to the fourteen synthetic configurations derived from the Carbopol impinging-jet experiments and that broader cross-condition testing would be valuable. To address this, we have revised the abstract and results to more precisely state that the framework provides automated analysis of primary atomization in the studied impinging jet configurations. We have also added a paragraph in the discussion section acknowledging the potential for feature tuning to the specific distribution and outlining plans for future validation on additional fluids and velocities. The physics-informed geometric features (e.g., relative positions, velocities, and sizes) are intended to capture general aspects of fragmentation kinematics, but we do not claim universality without further testing.
    revision: partial

  2. Referee: [Methods and Evaluation] Methods (association classifier) and Evaluation: Insufficient detail is given on held-out test construction for the association task, including how one-to-many fragmentation events were labeled, how severe class imbalance was handled in the test split, and whether temporal sequences were strictly separated to avoid leakage. No error bars, confidence intervals, or statistics from multiple training runs are reported for the 86.1% accuracy figure, making the perfect recall claim difficult to interpret as robust.

    Authors: We appreciate this feedback on the evaluation protocol. In the revised Methods section, we now provide a detailed description of the held-out test set construction: fragmentation events were labeled by expert visual inspection of high-speed sequences, with one-to-many associations identified when a parent ligament splits into multiple child droplets based on spatial proximity and temporal continuity. Class imbalance was addressed in the test split through stratified sampling to maintain representative proportions, and all temporal sequences were strictly partitioned by original experiment to prevent any data leakage across frames. Furthermore, we have added results from five independent training runs with different random seeds, reporting the mean accuracy of 86.1% with a standard deviation of 1.2%, along with confidence intervals for the fragmentation recall.
    revision: yes

  3. Referee: [Evaluation] Evaluation: The assumption that the physics-informed geometric features plus the morphology-preserving synthetic data suffice for reliable lineage reconstruction is load-bearing for the paper's contribution, yet it is only verified within the fourteen synthetic configurations; this leaves open the risk that the Transformer-augmented MLP will fail to generalize to unseen breakup regimes.

    Authors: We acknowledge that the verification is confined to the fourteen configurations and that this represents a limitation for claiming broad applicability. In the revised manuscript, we have expanded the Evaluation and Discussion sections to include an explicit limitations subsection addressing the risk of poor generalization to unseen regimes. We explain that the morphology-preserving synthetic data was designed to span variations in ligament shapes and breakup dynamics within the observed range, and the geometric features are derived from fundamental physics of fluid fragmentation (e.g., conservation of mass and momentum proxies). However, we agree that empirical validation on additional datasets is necessary for robustness and have noted this as future work.
    revision: partial
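The partitioning by original experiment described in response 2 can be sketched as a group-wise split, where whole experiments rather than individual frame pairs are assigned to train or test. This is a minimal sketch under assumed names (`split_by_group`, the `exp` key) and does not reproduce the authors' protocol; in particular it does not show how class stratification would be combined with the group constraint.

```python
import random


def split_by_group(samples, group_of, test_fraction=0.2, seed=0):
    """Assign every sample of a group to the same side of the split."""
    groups = sorted({group_of(s) for s in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if group_of(s) not in test_groups]
    test = [s for s in samples if group_of(s) in test_groups]
    return train, test


# Toy dataset (hypothetical): five experiments with four frame pairs each.
samples = [{"exp": e, "frame": f} for e in "ABCDE" for f in range(4)]
train, test = split_by_group(samples, group_of=lambda s: s["exp"])

train_experiments = {s["exp"] for s in train}
test_experiments = {s["exp"] for s in test}
```

Because no experiment appears on both sides, consecutive frames from one recording cannot leak between training and evaluation.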

Circularity Check

0 steps flagged

No circularity: purely empirical ML pipeline with held-out evaluation

full rationale

The paper describes a two-stage detection-plus-association framework trained on real Carbopol jet images plus morphology-preserving synthetics. All reported metrics (F1 0.872, accuracy 86.1%, fragmentation recall 1.00) are measured on held-out splits from the same data distribution. No equations, uniqueness theorems, or first-principles derivations are presented that reduce to fitted parameters or self-citations. The central claim is an empirical engineering result whose validity rests on generalization tests, not on any definitional loop.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim depends on the representativeness of the synthetic data and the sufficiency of the geometric features for association classification; these are domain assumptions rather than derived quantities.

free parameters (1)
  • Neural network weights and training hyperparameters
    Standard deep-learning fitted parameters; the central claim depends on their values learned from the augmented dataset.
axioms (2)
  • domain assumption Synthetic images preserve physically plausible ligament and droplet morphologies
    Invoked to justify data augmentation without introducing implausible configurations.
  • domain assumption Physics-informed geometric features are discriminative for continuation versus fragmentation
    Used to classify inter-frame associations in the second stage.

pith-pipeline@v0.9.0 · 5610 in / 1492 out tokens · 87791 ms · 2026-05-10T17:51:01.010345+00:00 · methodology

