
arxiv: 2605.17445 · v1 · pith:BK3ANGRXnew · submitted 2026-05-17 · 🌌 astro-ph.IM

Stellar Density Classification and Regression for CSST Multi-color Imaging Using Deep Learning

Pith reviewed 2026-05-19 22:41 UTC · model grok-4.3

classification 🌌 astro-ph.IM
keywords: stellar density classification, deep learning, CSST, source extraction, ResNet, astronomical imaging, regression, survey data processing

The pith

A two-stage deep learning model classifies CSST images into six stellar density levels and regresses bright star counts to adapt source extraction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that a hierarchical neural network can first sort multi-color images into six discrete stellar density categories and then estimate the count of bright stars in each field. This matters because the CSST survey will cover an enormous range of environments, from nearly empty extragalactic voids to the densely packed Galactic center, where a single fixed extraction pipeline produces large systematic errors. The first stage acts as a gate that selects the right processing settings for the second stage, which focuses on predicting the number of stars brighter than 23.5 magnitude. By separating density recognition from actual source finding, the approach aims to keep both photometry and astrometry accurate and uniform across the entire survey. If the method holds, the resulting catalogs should show fewer density-dependent biases than those produced by conventional algorithms.

Core claim

The authors claim that a ResNet-34 classifier achieves 98.83 percent global accuracy when assigning CSST images to one of six stellar density categories, while a following ResNet-50 regressor predicts the number of stars brighter than 23.5 magnitude with a mean absolute error of 0.0824 dex; together these steps allow the source-extraction pipeline to be matched to the local density environment and thereby reduce systematic uncertainties in both crowded and sparse regions.
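To make the dex figure concrete, the following minimal sketch computes a mean absolute error in log10 space and converts a dex offset into a multiplicative factor. The function names and the toy counts are illustrative assumptions, not taken from the paper, and the counts must be positive for the logarithm to be defined.

```python
import math

def mae_dex(true_counts, predicted_counts):
    """Mean absolute error of log10(count), i.e. the error expressed in dex."""
    errors = [abs(math.log10(p) - math.log10(t))
              for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)

def dex_to_factor(dex):
    """Multiplicative factor corresponding to an offset of `dex` in log10."""
    return 10.0 ** dex

# Toy example (made-up numbers): each prediction is off by a factor of 10^0.1.
true_counts = [100.0, 1000.0]
predicted_counts = [100.0 * 10 ** 0.1, 1000.0 / 10 ** 0.1]
print(round(mae_dex(true_counts, predicted_counts), 3))  # 0.1

# The reported MAE of 0.0824 dex corresponds to a factor of about 1.21.
print(round(dex_to_factor(0.0824), 2))  # 1.21
```

Read this way, an MAE of 0.0824 dex means a typical predicted count differs from the true count by roughly 21 percent.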

What carries the argument

The central mechanism is the hierarchical two-stage model in which a classification network first assigns an image to a density category that then gates the application of a regression network predicting bright-star counts, thereby decoupling density characterization from downstream photometric and astrometric processing.
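The gating logic can be sketched as follows. This is a minimal illustration under stated assumptions: the two models are passed in as plain callables, `two_stage_estimate` and the stand-in models are hypothetical names, and the set of classes forwarded to the regressor is a placeholder because this summary does not state which classes pass the gate.

```python
from typing import Callable, Optional, Set

def two_stage_estimate(
    image,
    classify: Callable[[object], int],
    regress_log_count: Callable[[object], float],
    regression_classes: Set[int],
) -> dict:
    """Run the classifier, then the regressor only if the class passes the gate."""
    density_class = classify(image)
    log_count: Optional[float] = None
    if density_class in regression_classes:
        log_count = regress_log_count(image)
    return {"density_class": density_class, "log10_bright_stars": log_count}

# Stand-in models (hypothetical); real use would wrap the trained networks.
fake_classifier = lambda img: img["class"]
fake_regressor = lambda img: 2.5

# Assumption: only classes 1 to 4 are forwarded to the regressor.
gate = {1, 2, 3, 4}
print(two_stage_estimate({"class": 3}, fake_classifier, fake_regressor, gate))
# {'density_class': 3, 'log10_bright_stars': 2.5}
print(two_stage_estimate({"class": 0}, fake_classifier, fake_regressor, gate))
# {'density_class': 0, 'log10_bright_stars': None}
```

Keeping the gate as an explicit argument makes it easy to test the control flow separately from either network.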

If this is right

  • Photometric and astrometric algorithms can be chosen or calibrated according to the classified density category.
  • Systematic errors that normally appear in both very crowded and very sparse fields are reduced by using density-specific settings.
  • Astrometric calibration benefits directly from the accurate prediction of bright reference stars.
  • The overall data products gain greater homogeneity across the survey's full dynamic range.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same staged architecture could be retrained on data from other wide-field surveys that also span large density contrasts.
  • Inserting the classifier into an online data-reduction pipeline might allow immediate selection of the appropriate extraction parameters during observation.
  • Extending the regression target to include additional field statistics such as average color or crowding index could further refine the adaptation.

Load-bearing premise

The six chosen density categories and the set of training images are taken to be representative of the full range of stellar densities that actual CSST observations will contain.

What would settle it

Running the trained model on a large sample of real CSST data spanning voids to the Galactic center and obtaining either global classification accuracy below 95 percent or a regression mean absolute error above 0.15 dex would falsify the reported performance.
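The test described above reduces to two threshold comparisons. The sketch below encodes them; the function names are hypothetical, the thresholds are the ones quoted in this section, and the label lists are made-up toy data.

```python
def global_accuracy(true_labels, predicted_labels):
    """Percentage of images whose predicted class equals the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)

def is_falsified(accuracy_percent, mae_dex,
                 min_accuracy=95.0, max_mae_dex=0.15):
    """True if either threshold quoted above is violated."""
    return accuracy_percent < min_accuracy or mae_dex > max_mae_dex

# The reported figures pass both thresholds.
print(is_falsified(98.83, 0.0824))  # False

# Toy labels: 3 of 4 correct gives 75 percent, which fails the accuracy threshold.
accuracy = global_accuracy([0, 1, 2, 3], [0, 1, 2, 2])
print(accuracy)                        # 75.0
print(is_falsified(accuracy, 0.0824))  # True
```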

Figures

Figures reproduced from arXiv: 2605.17445 by Chao Liu, Hao Tian, Jialu Nie, Jianjun Chen, Jinzhi Lai, Man I Lam, Ming Yang, Xiaohan Chen, Xin Zhang.

Figure 1
Figure 1: Schematic diagram of the main focal plane of the CSST survey camera. There are 18 black squares for detectors used in multi-color imaging observations (NUV, u, g, r, i, z, and y), and 12 orange squares for GU (255-400 nm), GV (400-620 nm) and GI (620-1000 nm) slitless spectra observations.
2.2. Data Simulation. The data used in this work were generated using the CSST simulation software, developed to produc…
Figure 2
Figure 2: Visualization of the mapping between the raw CSST detector data and the network input. (a) The original high-resolution detector patch (9232 × 9216 pixels, 11.4′ × 11.4′). (b) The corresponding 224 × 224 feature map used as the input for the ResNet model. Both panels use a logarithmic stretch to enhance visibility.
Figure 3
Figure 3: Representative samples of the six density classes. Each panel corresponds to the full observational area of one CSST detector chip (9232 × 9216 pixels), demonstrating the global density patterns used for classification. The orange label indicates the density class, ranging from 0 (empty field) to 5 ($>10^5$ sources).
[Plot residue: axes Mag (12 to 28) and Number Density (logarithmic scale).]
Figure 4
Figure 4: The magnitude distribution at different densities in the classification dataset. Note that the density range represented by the label in each panel title is listed in Table 2.
… tasks, we apply data augmentation techniques to the training sets, as described in Section 2.3. 3. MODELS. Although a unified regression model could predict star counts across the full density range, we adopt a hier-
Figure 5
Figure 5: Distribution of stellar sources, galaxy counts, and bright stars (mag < 23.5) in the regression dataset. The histograms show the comprehensive coverage of density variations, with particular emphasis on the bright star population essential for astrometric calibration.
Figure 6
Figure 6: The structure of the ResNet block.
3.2. Training process. 3.2.1. Classification Model Training. We trained the classification model using Cross-Entropy loss and the AdamW optimizer. This setup effectively penalizes misclassifications by measuring the discrepancy between predicted probabilities and ground-truth labels, thereby refining the decision boundaries across the six density classes. The AdamW optimize…
Figure 7
Figure 7: Representative training curves showing the convergence behavior for both classification and regression tasks.
Figure 8
Figure 8: Confusion Matrix of the Classification Model.
Figure 9
Figure 9: Actual versus predicted bright star counts for the regression model. The solid red line represents perfect predictions (y = x), while the dashed lines indicate the ±20% error margin. The color density reflects the concentration of data points, showing strong agreement across most of the dynamic range with some expected scatter at extreme values. Plot annotations: axes True Value (Log Count) versus Predicted Value (Log Count), both spanning 1.25 to 3.25; color bar Point Density; R² = 0.9076, RMSE = 0.1332, MAE = 0.0824.
Figure 10
Figure 10: Histogram of source density distribution for all classification test set samples (blue bars) with logarithmically spaced bins, overlaid with the distribution of misclassified samples (red hatched bars). The figure highlights that 50.3% of samples are concentrated near classification boundaries, with misclassified samples predominantly clustering around the 1k-source threshold for Class 3.
Figure 11
Figure 11: Density distributions of representative samples: (a) Correctly classified Class 2 sample (966 sources), (b) Class 3 sample misclassified as Class 2 (1005 sources), and (c) Correctly classified Class 3 sample (1343 sources). The similarity between (a) and (b) illustrates the challenge of classifying samples near the 1k-source boundary.
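
The mapping described in the Figure 2 caption, from a 9232 × 9216 detector patch to a 224 × 224 network input, can be illustrated with a small sketch. The resampling method is not given in this summary, so the block-averaging below is an assumption, as is the `log10(1 + x)` stretch. The caption mentions a logarithmic stretch for display only, and whether the network input is stretched the same way is not stated. `downsample_log` is a hypothetical name, and pixel values are assumed to be non-negative.

```python
import math

def downsample_log(image, out_rows, out_cols):
    """Average-pool `image` onto an out_rows x out_cols grid, then apply log10(1 + x)."""
    n_rows, n_cols = len(image), len(image[0])
    sums = [[0.0] * out_cols for _ in range(out_rows)]
    counts = [[0] * out_cols for _ in range(out_rows)]
    for r, row in enumerate(image):
        i = r * out_rows // n_rows          # output row receiving this pixel
        for c, value in enumerate(row):
            j = c * out_cols // n_cols      # output column receiving this pixel
            sums[i][j] += value
            counts[i][j] += 1
    return [[math.log10(1.0 + sums[i][j] / counts[i][j])
             for j in range(out_cols)] for i in range(out_rows)]

# Toy 4 x 4 image reduced to 2 x 2 (made-up pixel values).
toy = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [99, 99, 0, 0],
       [99, 99, 0, 0]]
print(downsample_log(toy, 2, 2))  # [[0.0, 1.0], [2.0, 0.0]]
```

The integer bin mapping also handles sizes that do not divide evenly, which matters here because neither 9232 nor 9216 is a multiple of 224.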
Original abstract

The Chinese Space Station Survey Telescope (CSST) aims to map the universe across an unprecedented dynamic range of stellar densities, spanning from extragalactic voids to the crowded Galactic center (e.g., a few stars and galaxies in the voids and $>10^5$ stars per detector in the Galactic center). However, processing such heterogeneous data with a general source extraction pipeline introduces significant systematic uncertainties: standard algorithms exhibit poor accuracy in crowded fields and suffer from increased astrometric uncertainty in void regions. To mitigate these systematics, we propose a hierarchical, two-stage deep learning model for adaptive data reduction. The first stage ('classification') employs a ResNet-34 model to classify images into six discrete density categories, achieving $98.83\%$ global accuracy. This classification acts as a critical decision gate, ensuring high calibration accuracy in the crowded fields. In the second stage ('regression'), a ResNet-50 regression model predicts the number of bright stars ($<23.5$ mag) in the field, which is essential for astrometric calibration, achieving a mean absolute error (MAE) of 0.0824 dex. By decoupling density characterization from source extraction, our model ensures that photometric and astrometric algorithms are optimally matched to the stellar density environment, thereby enhancing the fidelity and homogeneity of data products from CSST and from future large sky surveys.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript describes a hierarchical two-stage deep learning model for classifying CSST multi-color images into six stellar density categories using ResNet-34 (98.83% global accuracy) and regressing the number of bright stars using ResNet-50 (MAE 0.0824 dex). This is proposed to adapt source extraction pipelines to varying stellar densities from voids to the Galactic center.

Significance. If the results hold, this work provides a practical tool for handling heterogeneous data in the CSST survey and similar future missions. The approach of using DL to gate density-adapted processing could improve the fidelity of astrometric and photometric measurements. The provision of specific numerical performance metrics is a positive aspect of the presentation.

major comments (2)
  1. [Abstract] The abstract reports concrete accuracy (98.83%) and MAE (0.0824 dex) figures but supplies no information on training-set size, cross-validation strategy, class imbalance handling, or comparison against non-DL baselines. Without these details the central performance claims cannot be fully evaluated.
  2. [Methods] The six discrete density categories and the training images are assumed to be representative of actual CSST data across the full dynamic range, but the paper does not provide verification or details on how the image generation process matches the instrument's PSF, noise, and crowding statistics, particularly in the high-density tail ($>10^5$ stars).
minor comments (2)
  1. [Abstract] Consider adding a sentence on the overall dataset characteristics or number of images used for training and testing.
  2. Ensure all acronyms are defined at first use (e.g., CSST, ResNet).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our work on adaptive processing for CSST data. We respond to each major comment below and indicate planned revisions.

Point-by-point responses
  1. Referee: [Abstract] The abstract reports concrete accuracy (98.83%) and MAE (0.0824 dex) figures but supplies no information on training-set size, cross-validation strategy, class imbalance handling, or comparison against non-DL baselines. Without these details the central performance claims cannot be fully evaluated.

    Authors: We agree that the abstract would benefit from additional context on these aspects to allow readers to evaluate the claims more readily. The full details on training-set size, cross-validation, class imbalance handling via weighted sampling, and comparisons to non-DL baselines are provided in the Methods and Results sections. In the revised manuscript we will expand the abstract with a concise statement summarizing the dataset scale, validation approach, and that the model outperforms traditional density estimation methods, while respecting length constraints. revision: yes

  2. Referee: [Methods] The six discrete density categories and the training images are assumed to be representative of actual CSST data across the full dynamic range, but the paper does not provide verification or details on how the image generation process matches the instrument's PSF, noise, and crowding statistics, particularly in the high-density tail ($>10^5$ stars).

    Authors: The Methods section describes the simulation framework used to generate the training images based on CSST instrument specifications. To strengthen this, we will add explicit verification details, including quantitative comparisons of the simulated PSF, noise properties, and crowding statistics against expected CSST performance, with particular attention to the high-density regime. This will confirm representativeness across the full dynamic range from voids to the Galactic center. revision: yes
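
The first response mentions class imbalance handling via weighted sampling without giving the scheme. One common choice, shown here purely as an assumption, is inverse-frequency weighting, where every class receives the same total sampling weight. The function name and the toy labels are hypothetical.

```python
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-sample weights such that every class has the same total weight."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]

# Toy label set (made up): class 0 is three times as common as class 5.
labels = [0, 0, 0, 0, 0, 0, 5, 5]
weights = inverse_frequency_weights(labels)

# Draw a resampled training stream; a fixed seed keeps the sketch reproducible.
rng = random.Random(0)
drawn = rng.choices(labels, weights=weights, k=10_000)
print(Counter(drawn))  # the two classes appear in roughly equal numbers
```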

Circularity Check

0 steps flagged

No circularity: standard supervised ML training on external/simulated images yields independent performance metrics.

Full rationale

The paper trains a ResNet-34 classifier and ResNet-50 regressor on curated or simulated CSST-like images to predict discrete density bins and log star counts. Reported figures (98.83% accuracy, 0.0824 dex MAE) are empirical test-set statistics, not quantities defined in terms of the model outputs or fitted parameters by construction. No self-citation chain, ansatz smuggling, or renaming of known results appears in the provided derivation; the six categories are defined by star counts per detector, an external input. The pipeline is self-contained against external benchmarks once the training distribution is accepted.
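
Because the categories are defined by source counts per detector, assigning a label is a simple binning step. The sketch below is illustrative only: the edges at 1,000 (Class 3) and 100,000 (Class 5) follow the figure captions, while the edges at 1, 100 and 10,000 and the treatment of counts lying exactly on an edge are placeholder assumptions. The paper's Table 2 holds the actual ranges.

```python
import bisect

# Lower edges of classes 1 to 5, in sources per detector.
# 1_000 and 100_000 follow the figure captions; 1, 100 and 10_000 are
# placeholder assumptions, and edge inclusivity is assumed as well.
CLASS_LOWER_EDGES = [1, 100, 1_000, 10_000, 100_000]

def density_class(n_sources: int) -> int:
    """Return the density class (0 to 5) for a source count."""
    return bisect.bisect_right(CLASS_LOWER_EDGES, n_sources)

# Source counts quoted in the Figure 11 caption, plus an empty field.
for n in (0, 966, 1005, 1343):
    print(n, density_class(n))
# 0 0
# 966 2
# 1005 3
# 1343 3
```

The pair 966 and 1005 shows why errors cluster at the 1k-source boundary: a difference of a few dozen sources flips the label.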

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the chosen density categories and bright-star magnitude cut are physically meaningful and that the simulated or observed training images adequately sample the CSST instrument response; no explicit free parameters or invented entities are stated in the abstract.

axioms (1)
  • domain assumption: The six discrete stellar-density categories form a sufficient and stable partitioning of the dynamic range encountered by CSST.
    Invoked by the classification stage description; if the categories do not align with actual error regimes, the decision gate loses utility.

pith-pipeline@v0.9.0 · 5795 in / 1334 out tokens · 39272 ms · 2026-05-19T22:41:56.766657+00:00 · methodology

