
arxiv: 2505.19583 · v2 · submitted 2025-05-26 · 🌌 astro-ph.GA

Identifying lopsidedness in spiral galaxies using a Deep Convolutional Neural Network

Pith reviewed 2026-05-19 14:50 UTC · model grok-4.3

classification 🌌 astro-ph.GA
keywords lopsided galaxies · deep convolutional neural network · transfer learning · galaxy morphology · SDSS · spiral galaxies · disk asymmetry · automated classification

The pith

A transfer-learned convolutional neural network identifies lopsided spiral galaxies at 87 percent accuracy and shows they tend to be low-mass, blue, and actively star-forming.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors train a deep convolutional neural network on a modest set of visually labeled SDSS images to separate lopsided from symmetric spiral galaxies. This approach matters because roughly 30 percent of disk galaxies exhibit large-scale asymmetry whose origin is still unclear, and a scalable catalog could link the feature to specific formation channels. Starting from 490 lopsided and 444 symmetric examples, the fine-tuned model reaches 87 percent test accuracy across repeated trials. Applied to the remaining sample it yields thousands of new classifications whose host galaxies differ systematically in color, star-formation rate, mass, and concentration.

Core claim

The paper shows that fine-tuning a Zoobot model on SDSS DR18 imaging produces a binary classifier that recovers lopsided versus symmetric morphology with 87 percent accuracy; the resulting catalog of 3,679 lopsided and 2,429 symmetric galaxies reveals that the lopsided systems are preferentially high-star-forming, bluer, low-concentration, low-mass late-type galaxies.

What carries the argument

Transfer learning by fine-tuning a pre-trained deep convolutional neural network (Zoobot) on a visually classified training set of 490 lopsided and 444 symmetric galaxies to perform binary classification of SDSS g-band images.

If this is right

  • Lopsided galaxies are relatively high star-forming and bluer than symmetric galaxies in the same sample.
  • Lopsided galaxies tend to have lower concentration indices and lower stellar masses, consistent with late-type disks.
  • The catalog supplies a statistically useful sample for testing proposed drivers of disk asymmetry.
  • Public release of the model and labels enables direct reuse on future imaging surveys.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same transfer-learning pipeline could be retrained on other large-scale asymmetries once modest visual labels become available.
  • Correlating the new lopsided sample with environment or merger history would test whether external perturbations dominate the asymmetry.
  • If the color and star-formation differences persist at higher redshift, they may constrain the timescale on which lopsidedness is maintained.

Load-bearing premise

The authors' initial visual classification of 490 lopsided and 444 symmetric galaxies supplies an accurate, unbiased training set that captures the true morphological distinction.

What would settle it

If the galaxies labeled lopsided by the model show no statistically significant excess in star-formation rate or blueness relative to the symmetric sample, the classification would be called into question.

Figures

Figures reproduced from arXiv: 2505.19583 by Arunima Banerjee, Biju Saha, Suman Sarkar.

Figure 1: Clockwise from top left: The distributions of redshift, Petrosian radius (enclosing 90 % of the flux) in the g-band, extinction …
Figure 2: A subset of the lopsided (left) and symmetric (right) galaxies from the SDSS DR18 that are used for the training.
Figure 4: The Receiver Operating Characteristic (ROC) curve. The …
Figure 3: The loss and accuracy of the model as a function of …
Figure 5: The confusion matrix representing the correctly predicted …
Figure 6: Original images from the test set and the corresponding heatmap from the Grad-CAM analysis. The analysis uses the feature …
Figure 7: The distribution of redshift for the newly predicted …
Figure 7: A subset of the (a) lopsided (top row) and (b) symmetric galaxies (bottom row) predicted from SDSS DR18 using the best …
Figure 8: Redshift distributions for the newly predicted galaxy sam…
Figure 9: Clockwise from top left: Probability density function (PDF) of specific star formation rate (sSFR in Gyr …
Figure 11: Percentage of various spiral morphologies based on the …
Original abstract

About 30\% of disk galaxies show lopsidedness in their stellar disk. Although such a large-scale asymmetry in the disk can be primarily looked upon as a long-lived mode ($m=1$), the physical origin of the lopsidedness in the disk continues to be a puzzle. In this work, we employ a transfer-learning approach for the automated identification of lopsided galaxies using SDSS DR18 imaging by fine-tuning a Zoobot model, a deep convolutional neural network package pre-trained on the Galaxy Zoo dataset. We obtain 7,042 well-resolved, nearly face-on spiral galaxies from SDSS DR18 over the redshift range 0.01 $\leq z \leq 0.1$, with extinction-corrected g-band model magnitude < 16 and Petrosian radius (enclosing 90 \% of the flux) $\geq$ 3 arcsec. Out of these, we visually identify 490 lopsided and 444 symmetric galaxy samples suitable for training. The trained model achieves a testing accuracy of $(87 \pm 0.02)$ \%, averaged over 10 independent trials. Using the best-performing model, we identify 3,679 lopsided and 2,429 symmetric galaxies from the remaining sample. Of these, 2,658 lopsided and 1,455 symmetric galaxies are predicted with are predicted with high prediction probability $P_{pred} \geq 0.85$. Lopsided galaxies in our predicted samples are relatively high star-forming, bluer, low-concentration (late-type), low-mass galaxies compared to the symmetric galaxies. Our study produces an usable catalogue of lopsided and symmetric galaxies, which will offer new insights into the formation of lopsidedness in disk galaxies. The dataset and the best-performing model are made publicly available through GitHub at https://github.com/bijusaha-astro/CNN_lopsided
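The three numeric selection cuts quoted in the abstract can be written as a simple filter. This is a minimal sketch, not the authors' pipeline: the record layout and field names (`z`, `g_mag`, `petro_r90`) are hypothetical rather than SDSS column names, the toy records are invented, and the "nearly face-on spiral" criterion is left out because the abstract does not quantify it.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    # Hypothetical field names for illustration; not actual SDSS columns.
    z: float          # redshift
    g_mag: float      # extinction-corrected g-band model magnitude
    petro_r90: float  # Petrosian radius enclosing 90% of the flux, in arcsec


def passes_cuts(g: Galaxy) -> bool:
    """Apply the three numeric cuts stated in the abstract."""
    return 0.01 <= g.z <= 0.1 and g.g_mag < 16 and g.petro_r90 >= 3.0


# Invented toy records, one passing and three failing a single cut each.
sample = [
    Galaxy(z=0.05, g_mag=15.2, petro_r90=6.1),  # passes all cuts
    Galaxy(z=0.15, g_mag=15.0, petro_r90=5.0),  # too distant
    Galaxy(z=0.03, g_mag=16.4, petro_r90=4.0),  # too faint
    Galaxy(z=0.02, g_mag=14.9, petro_r90=2.5),  # too small on the sky
]
selected = [g for g in sample if passes_cuts(g)]
```
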

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript describes the development and application of a fine-tuned Zoobot deep convolutional neural network for classifying lopsided versus symmetric spiral galaxies in SDSS DR18 data. From a parent sample of 7,042 well-resolved, nearly face-on spirals (0.01 ≤ z ≤ 0.1), the authors visually select 490 lopsided and 444 symmetric galaxies as a training set. The model achieves an average test accuracy of 87 ± 0.02% over 10 trials. Applying the best model to the remaining galaxies yields 3,679 lopsided and 2,429 symmetric classifications, with 2,658 and 1,455 respectively at high prediction probability (P_pred ≥ 0.85). The paper reports that predicted lopsided galaxies are preferentially high star-forming, bluer, low-concentration, low-mass systems compared to symmetric ones, and releases the catalog and model publicly.
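The sample counts quoted in the summary can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Counts as quoted in the abstract.
parent = 7042
train_lopsided, train_symmetric = 490, 444
pred_lopsided, pred_symmetric = 3679, 2429
hc_lopsided, hc_symmetric = 2658, 1455  # P_pred >= 0.85

training = train_lopsided + train_symmetric   # 934 labeled galaxies
remaining = parent - training                 # 6108 left to classify
predicted = pred_lopsided + pred_symmetric    # 6108 model classifications
high_conf = hc_lopsided + hc_symmetric        # 4113 high-confidence predictions

# The predicted sample accounts for every galaxy outside the training set.
assert predicted == remaining

# Share of predictions that clear the 0.85 threshold (about 0.67).
high_conf_fraction = high_conf / predicted
```

The bookkeeping closes exactly: the 6,108 predictions are the 7,042 parent galaxies minus the 934 used for training.
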

Significance. Should the automated classifications prove robust against label noise and selection effects, the resulting catalog of several thousand galaxies would enable improved statistical analyses of the physical drivers of disk lopsidedness, such as interactions or internal instabilities. The public availability of the trained model and dataset enhances reproducibility and allows community validation or extension.

major comments (3)
  1. The visual classification of the 490 lopsided and 444 symmetric galaxies by the authors alone, without reported inter-rater reliability metrics, multi-author consensus, or cross-validation against independent quantitative measures such as the m=1 Fourier amplitude or existing lopsidedness catalogs, is a load-bearing assumption for the entire downstream analysis. This small training set (~934 examples) risks incorporating author-specific heuristics or label noise, which could affect the reported 87% accuracy and the property trends in the predicted sample.
  2. Details on the train/validation/test splits, handling of class imbalance, and any data augmentation or regularization are not sufficiently specified in the methods, making it difficult to assess the generalization of the 87% test accuracy averaged over 10 trials.
  3. The reported trends (higher star formation, bluer colors, lower concentration, lower mass for lopsided galaxies) should be accompanied by statistical significance tests and controls for potential selection biases in the parent sample or prediction thresholds, to confirm they are not artifacts of the classification.
minor comments (2)
  1. There is a repeated phrase in the abstract: 'are predicted with are predicted with high prediction probability' which should be corrected.
  2. Ensure consistent use of terminology for 'lopsidedness' and provide more details on the exact criteria used for visual identification in the main text.
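The inter-rater reliability metric requested in major comment 1 is commonly reported as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch assuming two hypothetical raters who label the same galaxies; the labels below are invented for illustration and are not from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / n**2
    if p_exp == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_obs - p_exp) / (1.0 - p_exp)


# Invented labels: "L" = lopsided, "S" = symmetric.
rater_a = ["L", "L", "S", "S", "L", "S", "L", "S"]
rater_b = ["L", "S", "S", "S", "L", "S", "L", "L"]
kappa = cohens_kappa(rater_a, rater_b)
```

Here the raters agree on 6 of 8 galaxies (0.75) while chance agreement is 0.5, so kappa is 0.5. In practice a library routine such as `sklearn.metrics.cohen_kappa_score` would normally be preferred.
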

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive report. We address each of the major comments below and indicate the revisions we have made or will make to the manuscript.

point-by-point responses
  1. Referee: The visual classification of the 490 lopsided and 444 symmetric galaxies by the authors alone, without reported inter-rater reliability metrics, multi-author consensus, or cross-validation against independent quantitative measures such as the m=1 Fourier amplitude or existing lopsidedness catalogs, is a load-bearing assumption for the entire downstream analysis. This small training set (~934 examples) risks incorporating author-specific heuristics or label noise, which could affect the reported 87% accuracy and the property trends in the predicted sample.

    Authors: We acknowledge the potential for subjectivity in the visual classifications performed by the lead author. The classification criteria were based on established visual indicators of lopsidedness, such as asymmetric stellar light distribution and prominent one-sided features in the disk. To address this concern, we have expanded the Methods section to provide a more explicit description of these criteria. We have also added a comparison of our labels with m=1 Fourier amplitudes for a subset of the training galaxies, showing consistency. While a formal inter-rater reliability study was not conducted, we note this as a limitation and suggest it for future work. The use of transfer learning from the Zoobot model, pre-trained on a large Galaxy Zoo dataset, helps to reduce the impact of any label noise in our small training set. revision: yes
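For readers unfamiliar with the m=1 Fourier amplitude mentioned in this exchange, the sketch below computes it for intensities sampled at equally spaced azimuthal angles around a ring. It assumes one common normalization, A1 = sqrt(a1² + b1²) / a0 with a0 the mean intensity, and uses synthetic rings; it is not the procedure of the paper or of the simulated rebuttal, and no classification threshold is implied.

```python
import math


def m1_amplitude(intensity):
    """Normalized m=1 Fourier amplitude for equally spaced azimuthal samples."""
    n = len(intensity)
    if n < 3:
        raise ValueError("need at least three azimuthal samples")
    a0 = sum(intensity) / n  # mean intensity on the ring
    if a0 <= 0:
        raise ValueError("mean intensity must be positive")
    # Cosine and sine coefficients of the m=1 mode.
    a1 = 2.0 / n * sum(v * math.cos(2 * math.pi * k / n) for k, v in enumerate(intensity))
    b1 = 2.0 / n * sum(v * math.sin(2 * math.pi * k / n) for k, v in enumerate(intensity))
    return math.hypot(a1, b1) / a0


n = 360
# Synthetic lopsided ring: 20% m=1 perturbation with an arbitrary phase.
lopsided_ring = [1.0 + 0.2 * math.cos(2 * math.pi * k / n - 0.7) for k in range(n)]
# Synthetic two-armed ring: pure m=2 perturbation, no lopsidedness.
two_armed_ring = [1.0 + 0.3 * math.cos(2 * (2 * math.pi * k / n)) for k in range(n)]

a1_lopsided = m1_amplitude(lopsided_ring)
a1_two_armed = m1_amplitude(two_armed_ring)
```

The lopsided ring recovers the injected amplitude of 0.2 regardless of phase, while the m=2 ring gives a value consistent with zero, which is what makes the m=1 amplitude a natural independent check on visual labels.
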

  2. Referee: Details on the train/validation/test splits, handling of class imbalance, and any data augmentation or regularization are not sufficiently specified in the methods, making it difficult to assess the generalization of the 87% test accuracy averaged over 10 trials.

    Authors: We appreciate this observation and have revised the manuscript to include comprehensive details on the experimental setup. The dataset was split into training (70%), validation (15%), and test (15%) sets. To handle the minor class imbalance, we applied class weighting during training. Data augmentation included random horizontal and vertical flips, rotations up to 30 degrees, and brightness adjustments. Regularization techniques such as dropout (rate 0.5) and early stopping based on validation loss were employed. The 10 trials involved repeating the training with different random seeds for data shuffling and model initialization to report the mean and standard deviation of the accuracy. revision: yes
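The 70/15/15 split and class weighting described in this response can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the split is stratified by class, the weights follow the common "balanced" convention n_samples / (n_classes × n_class), and the function names are hypothetical.

```python
import random


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split indices into train/val/test, preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, val, test = [], [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder, close to fractions[2]
    return train, val, test


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * n_class)."""
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    n, k = len(labels), len(counts)
    return {lab: n / (k * c) for lab, c in counts.items()}


# Class sizes from the paper; the labels stand in for the images.
labels = ["lopsided"] * 490 + ["symmetric"] * 444

# One split per trial, each with its own seed, mirroring the 10 repeated trials.
splits = [stratified_split(labels, seed=s) for s in range(10)]
train, val, test = splits[0]
weights = balanced_class_weights([labels[i] for i in train])
```

With 490 versus 444 examples the imbalance is mild, so the resulting weights sit within a few percent of 1, with the symmetric class weighted slightly higher.
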

  3. Referee: The reported trends (higher star formation, bluer colors, lower concentration, lower mass for lopsided galaxies) should be accompanied by statistical significance tests and controls for potential selection biases in the parent sample or prediction thresholds, to confirm they are not artifacts of the classification.

    Authors: We agree that rigorous statistical analysis is necessary. In the revised version, we have included two-sample Kolmogorov-Smirnov tests for the distributions of specific star formation rate, g-r color, concentration index, and stellar mass between the lopsided and symmetric predicted samples, with all p-values indicating statistically significant differences. To control for selection biases, we have repeated the trend analysis using only the high-confidence predictions (P_pred ≥ 0.85) and discussed how the parent sample selection (nearly face-on spirals with sufficient resolution) may influence the results. These additions confirm that the observed trends are robust. revision: yes
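The two-sample Kolmogorov-Smirnov test named in this response compares the empirical cumulative distributions of two samples. Below is a minimal stdlib sketch, assuming the standard asymptotic series for the p-value with a small-sample correction to the effective sample size; the data are synthetic Gaussian draws, not the paper's measurements.

```python
import math
import random


def ks_statistic(x, y):
    """Maximum absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both CDFs past every sample equal to v (handles ties).
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution."""
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0  # series converges poorly here; p is 1 to high precision
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Synthetic stand-ins for a property measured in two samples.
rng = random.Random(42)
sample_a = [rng.gauss(0.0, 1.0) for _ in range(500)]
sample_b = [rng.gauss(1.0, 1.0) for _ in range(500)]
d_stat = ks_statistic(sample_a, sample_b)
p_val = ks_pvalue(d_stat, len(sample_a), len(sample_b))
```

For real analyses `scipy.stats.ks_2samp` is the usual choice. With samples of several thousand galaxies even small distribution shifts yield tiny p-values, so it helps to report the statistic D or another effect size alongside p.
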

Circularity Check

0 steps flagged

No circularity: standard supervised CNN fine-tuning on external SDSS imaging with author visual labels

full rationale

The paper's pipeline consists of selecting resolved spiral galaxies from SDSS DR18, performing a one-time visual classification of 490 lopsided and 444 symmetric examples by the authors, fine-tuning a pre-trained Zoobot model on those labels, and applying the resulting classifier to the remaining sample to produce a catalog. This is a conventional transfer-learning workflow with no equations, fitted parameters, or self-referential definitions that reduce the output to the input by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked; the 87% test accuracy and downstream property trends are empirical results from learned image features rather than tautological renamings or statistical forcing. The approach remains self-contained against external imaging benchmarks and does not match any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that human visual labels are sufficiently reliable to train a generalizable classifier and that SDSS imaging contains the necessary morphological information.

free parameters (1)
  • prediction probability threshold = 0.85
    The cutoff P_pred ≥ 0.85 is chosen post-training to define the high-confidence subset.
axioms (1)
  • domain assumption: Visual labels assigned by the authors accurately reflect the morphological property of lopsidedness without significant subjectivity or bias.
    The training set of 490 lopsided and 444 symmetric galaxies is defined by visual inspection.
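The role of the 0.85 threshold listed under free parameters can be illustrated with a short sketch. It assumes the classifier outputs a single probability of the lopsided class and that P_pred is the probability of whichever class is predicted; the source does not spell out this definition, so treat it as an assumption. The scores are invented.

```python
def classify(p_lopsided, threshold=0.85):
    """Return (label, P_pred, high_confidence) for one model output.

    Assumes P_pred is the probability assigned to the predicted class.
    """
    label = "lopsided" if p_lopsided >= 0.5 else "symmetric"
    p_pred = max(p_lopsided, 1.0 - p_lopsided)
    return label, p_pred, p_pred >= threshold


# Invented model outputs for five galaxies.
scores = [0.97, 0.60, 0.125, 0.25, 0.875]
results = [classify(p) for p in scores]
high_confidence = [r for r in results if r[2]]
```

Because the cutoff is set after training, re-running the downstream comparisons at a few neighbouring thresholds is a cheap way to see whether the reported trends depend on that choice.
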

pith-pipeline@v0.9.0 · 5889 in / 1442 out tokens · 89569 ms · 2026-05-19T14:50:36.882991+00:00 · methodology

