arxiv: 2605.04028 · v1 · submitted 2026-05-05 · 🌌 astro-ph.GA · astro-ph.CO

A Multi-parameter Fuzzy Set Framework for Classifying Red, Blue, and Green Valley Galaxies

Pith reviewed 2026-05-07 03:25 UTC · model grok-4.3

classification 🌌 astro-ph.GA astro-ph.CO
keywords fuzzy classification · galaxy populations · red sequence · green valley · blue cloud · star formation · morphology · clustering

The pith

Fuzzy classification assigns galaxies continuous degrees of membership to red, blue, and green populations based on multiple properties.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents a fuzzy set framework for classifying galaxies into red sequence, blue cloud, and green valley categories. It derives smooth membership functions from the bimodal distributions in color, star formation rate, and a spectral index using statistical modeling. These functions are combined conservatively across parameters. A reader would care because hard cuts in single properties often mix populations, while this method aims for cleaner samples to trace how galaxies quench their star formation and change shape over time.

Core claim

The authors claim that applying the multi-parameter fuzzy classification to a large sample of galaxies produces red populations with unimodal low star formation rates, green valley galaxies with clearer morphological transition signatures, and overall reduced contamination compared to earlier hard-cut schemes.

What carries the argument

Sigmoidal membership functions derived from Gaussian mixture modeling of bimodal distributions in color, specific star formation rate, and D4000, combined via the minimum operator to assign continuous membership degrees.
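
As a rough illustration of that machinery, the sketch below evaluates a redness membership for a single galaxy and combines the three observables with the minimum operator. The logistic form, the dictionary keys, and every midpoint and width are placeholder assumptions, not the paper's fitted GMM values; the paper's exact sigmoid, including the normalization factor mentioned in the Figure 2 caption, is not reproduced here.

```python
import math

def sigmoid(x, midpoint, width):
    """Logistic curve rising from 0 to 1 around `midpoint` (assumed functional form)."""
    return 1.0 / (1.0 + math.exp(-(x - midpoint) / width))

# Hypothetical (midpoint, width, sign) per observable. sign=+1 means larger
# values are "redder"; sign=-1 means smaller values are "redder".
PARAMS = {
    "u_minus_r": (2.2, 0.15, +1),
    "log_ssfr": (-2.0, 0.30, -1),   # log10(sSFR / Gyr^-1), placeholder midpoint
    "d4000": (1.6, 0.10, +1),
}

def redness(galaxy):
    """Combine per-observable redness memberships with the fuzzy minimum operator."""
    per_property = [
        sigmoid(sign * galaxy[name], sign * midpoint, width)
        for name, (midpoint, width, sign) in PARAMS.items()
    ]
    return min(per_property)

# A galaxy that is red in colour but still forming stars gets a low combined
# membership, because the minimum follows the least-red observable.
quenched = {"u_minus_r": 2.6, "log_ssfr": -3.0, "d4000": 1.9}
dusty_star_former = {"u_minus_r": 2.6, "log_ssfr": -1.0, "d4000": 1.3}
print(redness(quenched), redness(dusty_star_former))
```

The second toy galaxy shows why the operator is described as conservative: a single discordant observable is enough to pull the combined membership down.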

If this is right

  • Red galaxies display a single-peaked distribution at low specific star formation rates.
  • Green-valley galaxies exhibit more distinct morphological evolution signatures.
  • Active galactic nucleus fractions show similar trends with stellar mass as in prior classifications.
  • Fuzzy red galaxies display stronger large-scale clustering, linking them to denser environments.
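
For readers unfamiliar with how the two-point correlation function ξ(r) behind the clustering bullet is usually measured, one common choice is the Landy & Szalay (1993) estimator, which appears in the paper's reference list. Whether the authors use exactly this estimator is an assumption here; DD, DR, and RR denote normalized data-data, data-random, and random-random pair counts at separation r.

```latex
\xi_{\mathrm{LS}}(r) = \frac{DD(r) - 2\,DR(r) + RR(r)}{RR(r)}
```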

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could improve studies of galaxy quenching by providing less mixed transitional samples.
  • Subtle clustering differences hint that environment plays a stronger role in the red sequence than hard cuts reveal.
  • Extending the framework to additional observables like metallicity might further refine the separation of populations.
  • Such classifications may help model the assembly history of galaxies in simulations more accurately.

Load-bearing premise

The observed bimodal distributions in the chosen galaxy properties arise from distinct physical populations that Gaussian mixtures can model to create accurate membership functions.

What would settle it

Finding that the fuzzy green-valley galaxies show no clearer morphological transition features than hard-cut ones when examined in high-resolution images from an independent survey would falsify the improvement claim.

Figures

Figures reproduced from arXiv: 2605.04028 by Amit Mondal, Biswajit Pandey.

Figure 1
Probability density distributions of galaxy properties for a sample drawn from SDSS. The gray histograms show the data. The olive and cyan curves represent the two Gaussian components from the double-Gaussian fits, while the maroon curve shows their sum (the total model). From left to right, the panels correspond to the (u − r) color, log10(M∗/M⊙), log10(sSFR/Gyr−1), and D4000.
Figure 2
Membership functions derived for different galaxy properties. The panels show the redness (red), greenness (green), and blueness (blue) membership functions for each observable. From left to right, the properties are the (u − r) colour, log10(sSFR/Gyr−1), and D4000. The factor 2 is used for normalization (Pandey, 2020). To obtain the overall redness membership when multiple galaxy properties are used, …
Figure 3
The left panel shows the stellar mass distribution, the middle panel the sSFR distribution, and the right panel the concentration index distribution for green valley galaxies identified using the fuzzy-based classification method (solid line) and the method proposed by Schawinski et al. (2014) (dashed line).
Figure 4
AGN fraction as a function of log10(M∗/M⊙) for green valley galaxies identified using the fuzzy-based classification method (solid line) and the method proposed by Schawinski et al. (2014) (dashed line). Error bars show 1σ uncertainties estimated with the beta-distribution quantile technique.
Figure 5
Two-point correlation function, ξ(r), as a function of separation r for red, blue, and green galaxies classified with the fuzzy-based method and with the method proposed by Schawinski et al. (2014).
Original abstract

We present a data-driven fuzzy set framework for classifying galaxies into the red sequence, blue cloud, and green-valley populations using multiple observables from the Sloan Digital Sky Survey (SDSS DR18). Unlike traditional methods based on hard boundaries in colour or stellar mass, our approach assigns continuous membership degrees using sigmoidal functions derived from bimodal galaxy properties, including (u − r) colour, specific star formation rate (sSFR), and D4000. Membership functions are constructed via Gaussian mixture modeling and combined using a conservative fuzzy minimum operator. Applying this method to a volume-limited sample of 88,579 galaxies, we compare with the empirical classification of Schawinski et al. (2014). The fuzzy approach reduces contamination in the red and green-valley populations and yields more physically consistent distributions of star formation and morphology. Red galaxies show a unimodal low-sSFR distribution, while green-valley galaxies exhibit clearer signatures of morphological evolution. We also examine the dependence of active galactic nucleus (AGN) fraction on stellar mass and find no significant differences between methods, indicating robust global AGN trends. However, clustering analysis reveals subtle differences: fuzzy-classified red galaxies show enhanced large-scale clustering, suggesting a stronger association with highly biased dark matter halos. These results demonstrate that fuzzy classification provides a flexible, physically motivated alternative to hard-cut methods, enabling a more accurate and interpretable view of galaxy populations and their evolution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a data-driven fuzzy set classification scheme for SDSS DR18 galaxies into red sequence, blue cloud, and green valley populations. Membership functions are derived via Gaussian mixture modeling of the bimodal distributions in (u-r) colour, specific star formation rate (sSFR), and D4000, then combined with the fuzzy minimum operator. Applied to a volume-limited sample of 88,579 galaxies, the method is compared to the hard-cut scheme of Schawinski et al. (2014) and is claimed to reduce contamination in the red and green-valley populations while producing more physically consistent star-formation and morphological distributions; additional results on AGN fractions and large-scale clustering are reported.

Significance. If the improvements in contamination and physical consistency can be independently verified, the multi-parameter fuzzy framework would constitute a useful, flexible alternative to hard boundaries for studying galaxy quenching and evolution. The continuous memberships and use of multiple observables address known limitations of colour or sSFR cuts, and the clustering result hints at possible halo-bias differences. However, the current validation is insufficient to establish these advantages.

major comments (3)
  1. [Abstract] Abstract and results section: the claim that fuzzy red galaxies exhibit a unimodal low-sSFR distribution and therefore more physically consistent star-formation properties is circular. Because sSFR is an explicit input to the GMM-derived membership functions, galaxies assigned high red membership are selected precisely for low sSFR; the resulting histogram is expected by construction and does not constitute independent evidence of reduced contamination.
  2. [Comparison with Schawinski et al. (2014)] Comparison to Schawinski et al. (2014): no quantitative performance metrics (purity, completeness, contamination fraction, or overlap statistics) are provided against any external ground truth such as mock catalogs, spectroscopic classifications, or multi-wavelength tracers. The reported reduction in contamination therefore remains qualitative.
  3. [Methods] Methods: the manuscript contains no error propagation for the GMM parameters, no robustness tests against sample selection or number of components, and no validation on mock catalogs. These omissions leave open whether the sigmoidal membership functions reflect underlying physical populations or are artifacts of the modeling and selection.
minor comments (2)
  1. [Methods] The mathematical definition of the fuzzy minimum operator and the precise procedure for combining the three membership functions should be stated explicitly with an equation.
  2. [Results] Figure captions and axis labels for the sSFR and morphology histograms should indicate whether the distributions are normalized or weighted by membership degree.
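
To make the second minor comment concrete, the sketch below contrasts the two conventions a caption would need to distinguish: a histogram of galaxies selected by a membership threshold, and a histogram in which every galaxy contributes in proportion to its membership degree. The function name, bin edges, and toy numbers are illustrative assumptions, not taken from the paper.

```python
def weighted_histogram(values, weights, edges):
    """Density-normalized histogram in which each value counts by its weight."""
    n_bins = len(edges) - 1
    counts = [0.0] * n_bins
    for v, w in zip(values, weights):
        for i in range(n_bins):
            is_last = i == n_bins - 1
            # Bins are half-open [lo, hi), except the last one, which is closed.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += w
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / (total * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]

# Toy log10(sSFR) values and hypothetical redness memberships.
log_ssfr = [-3.0, -2.5, -1.0, -0.5]
red_membership = [0.9, 0.8, 0.2, 0.1]
edges = [-4.0, -2.0, 0.0]

# Convention 1: weight every galaxy by its membership degree.
by_membership = weighted_histogram(log_ssfr, red_membership, edges)
# Convention 2: keep only galaxies above a threshold, each with unit weight.
hard_weights = [1.0 if mu > 0.5 else 0.0 for mu in red_membership]
by_threshold = weighted_histogram(log_ssfr, hard_weights, edges)
print(by_membership, by_threshold)
```

The two conventions give visibly different tails, which is why the captions should say which one was plotted.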

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive report. The comments identify key areas where our presentation and validation can be strengthened. We address each major comment below, indicating revisions that will be incorporated in the next version of the manuscript. Our responses focus on clarifying interpretations, adding quantitative elements where feasible with the available SDSS data, and improving methodological robustness without misrepresenting the current analysis.

Point-by-point responses
  1. Referee: [Abstract] Abstract and results section: the claim that fuzzy red galaxies exhibit a unimodal low-sSFR distribution and therefore more physically consistent star-formation properties is circular. Because sSFR is an explicit input to the GMM-derived membership functions, galaxies assigned high red membership are selected precisely for low sSFR; the resulting histogram is expected by construction and does not constitute independent evidence of reduced contamination.

    Authors: We agree that the referee's point on circularity is valid and that the low-sSFR distribution for high red-membership galaxies follows in part from the inclusion of sSFR in the GMM. We will revise the abstract and results sections to remove any implication that this distribution provides independent evidence of reduced contamination. Instead, we will frame the result as a demonstration of internal consistency: the multi-parameter fuzzy minimum operator produces red samples in which colour, sSFR, and D4000 are simultaneously aligned at the low-sSFR end, whereas single-parameter hard cuts can admit galaxies that are inconsistent across the other observables. We will also add explicit comparisons of the sSFR histograms under both classification schemes to illustrate the difference in contamination levels visible in the data. revision: partial

  2. Referee: [Comparison with Schawinski et al. (2014)] Comparison to Schawinski et al. (2014): no quantitative performance metrics (purity, completeness, contamination fraction, or overlap statistics) are provided against any external ground truth such as mock catalogs, spectroscopic classifications, or multi-wavelength tracers. The reported reduction in contamination therefore remains qualitative.

    Authors: We acknowledge that the current comparison is primarily qualitative. In the revised manuscript we will add quantitative overlap statistics, including the fraction of galaxies receiving high membership in one class under the fuzzy scheme but classified differently by the Schawinski et al. (2014) hard cuts. We will also report the fraction of fuzzy red galaxies that show detectable H-alpha emission or other star-formation tracers as a proxy for residual contamination, and apply two-sample Kolmogorov-Smirnov tests to the morphological and sSFR distributions of the two red samples. While independent mock catalogs with known quenching histories are not part of the present study, these internal metrics using SDSS observables will make the claimed reduction in contamination more quantitative. revision: yes
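
    The internal metrics promised in this response could be sketched as follows. This is a standard-library illustration under stated assumptions: the function names are hypothetical, the 0.5 membership threshold is a placeholder, and the rejection rule uses the usual large-sample critical value for the two-sample Kolmogorov-Smirnov statistic rather than an exact p-value.

```python
import bisect
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    # The maximum gap is attained at one of the observed values.
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )

def ks_rejects(a, b, alpha=0.05):
    """Large-sample test: reject equality when D exceeds c(alpha) * sqrt((n+m)/(n*m))."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))

def disagreement_fraction(memberships, hard_is_red, threshold=0.5):
    """Fraction of fuzzy-red galaxies (membership > threshold) not called red by the hard cut."""
    flags = [h for mu, h in zip(memberships, hard_is_red) if mu > threshold]
    if not flags:
        return 0.0
    return sum(1 for h in flags if not h) / len(flags)

# Deterministic toy samples standing in for two sSFR distributions.
sample_a = list(range(100))
sample_b = [v + 50 for v in sample_a]
print(ks_statistic(sample_a, sample_b), ks_rejects(sample_a, sample_b))
print(disagreement_fraction([0.9, 0.8, 0.2, 0.7], [True, False, False, True]))
```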

  3. Referee: [Methods] Methods: the manuscript contains no error propagation for the GMM parameters, no robustness tests against sample selection or number of components, and no validation on mock catalogs. These omissions leave open whether the sigmoidal membership functions reflect underlying physical populations or are artifacts of the modeling and selection.

    Authors: We thank the referee for highlighting these methodological gaps. The revised manuscript will include: (i) bootstrap-derived uncertainties on the GMM means and variances used to construct the sigmoidal membership functions; (ii) robustness tests repeating the full pipeline on subsamples with altered stellar-mass and redshift cuts; and (iii) explicit checks of stability when the GMM is fit with two versus three components. These additions will be presented in a new subsection of the methods. Validation against mock catalogs with known physical quenching states would be valuable but requires external hydrodynamical simulation data and is outside the scope of this observational paper; we will note this limitation and identify it as a natural direction for follow-up work. revision: partial
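
    Item (i) could look roughly like the sketch below: a two-component Gaussian mixture is refit on bootstrap resamples and the scatter of the fitted means is reported. The plain EM routine, the function names, and the synthetic colour sample are all assumptions for illustration; the paper cites scikit-learn (Pedregosa et al. 2011), so the authors' actual fitting code likely differs.

```python
import math
import random
import statistics

def fit_gmm2(x, n_iter=25):
    """Plain EM for a two-component 1D Gaussian mixture.

    Returns [(mean, sigma, weight), (mean, sigma, weight)] sorted by mean.
    """
    xs = sorted(x)
    half = len(xs) // 2
    mu = [statistics.fmean(xs[:half]), statistics.fmean(xs[half:])]
    sd = [statistics.pstdev(xs)] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each point.
        resp0 = []
        for v in x:
            p = [w[k] / sd[k] * math.exp(-0.5 * ((v - mu[k]) / sd[k]) ** 2) for k in (0, 1)]
            total = p[0] + p[1]
            resp0.append(p[0] / total if total > 0 else 0.5)
        # M-step: update mean, width, and weight of each component.
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - t for t in resp0]
            nk = max(sum(r), 1e-12)
            mu[k] = sum(ri * v for ri, v in zip(r, x)) / nk
            var = sum(ri * (v - mu[k]) ** 2 for ri, v in zip(r, x)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
            w[k] = nk / len(x)
    return sorted(zip(mu, sd, w))

def bootstrap_means(x, n_boot=20, seed=0):
    """Return ((mean, std) of the lower component mean, (mean, std) of the upper one)."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_boot):
        resample = rng.choices(x, k=len(x))  # sample with replacement
        (m_lo, _, _), (m_hi, _, _) = fit_gmm2(resample)
        lows.append(m_lo)
        highs.append(m_hi)
    return ((statistics.fmean(lows), statistics.stdev(lows)),
            (statistics.fmean(highs), statistics.stdev(highs)))

# Synthetic bimodal "colour" sample with made-up modes at 1.5 and 2.5.
rng = random.Random(1)
colour = [rng.gauss(1.5, 0.2) for _ in range(200)] + [rng.gauss(2.5, 0.2) for _ in range(200)]
blue_mean, red_mean = bootstrap_means(colour)
print(blue_mean, red_mean)  # each is (bootstrap mean, bootstrap standard deviation)
```

    The bootstrap standard deviations would then be propagated into the midpoints and widths of the membership functions.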

Circularity Check

1 step flagged

sSFR distribution of fuzzy red galaxies reduces to input GMM by construction

specific steps
  1. fitted input called prediction [Abstract (and results section describing sSFR distributions)]
    "Red galaxies show a unimodal low-sSFR distribution, while green-valley galaxies exhibit clearer signatures of morphological evolution. ... The fuzzy approach reduces contamination in the red and green-valley populations and yields more physically consistent distributions of star formation and morphology."

    Membership functions are constructed via Gaussian mixture modeling of the bimodal sSFR distribution, so galaxies receiving high red membership are selected for low sSFR by definition. The subsequent claim that fuzzy red galaxies display a unimodal low-sSFR distribution is forced by the input variable used to build the classifier and is not an independent demonstration of reduced contamination.

full rationale

The paper derives sigmoidal membership functions for red/blue/green classes directly from GMM fits to the observed bimodal distributions in (u-r), sSFR, and D4000. It then reports that the fuzzy red population exhibits a unimodal low-sSFR distribution as evidence of reduced contamination and greater physical consistency. Because high red membership is assigned precisely to galaxies with low sSFR (the same variable used to fit the GMM), the reported sSFR histogram is a direct consequence of the classification rule rather than an independent test. Morphology provides a partially orthogonal check, but the star-formation consistency claim is circular. No external ground-truth metrics (purity/completeness vs. simulations or independent tracers) are supplied to break the loop. The comparison to Schawinski et al. (2014) hard cuts is presented without quantitative validation against any external benchmark.
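
The "by construction" point can be checked with a toy sample. In the sketch below, a redness membership that depends on sSFR is thresholded, and every selected galaxy necessarily falls on the low-sSFR side of the sigmoid midpoint, whatever the parent distribution looks like. The logistic form, midpoint, width, and mode positions are placeholder assumptions, not the paper's values. Under the minimum operator the same bound holds, because the combined membership can never exceed the sSFR membership.

```python
import math
import random

MIDPOINT = -2.0  # assumed sigmoid midpoint in log10(sSFR / Gyr^-1)

def red_membership_from_ssfr(log_ssfr, midpoint=MIDPOINT, width=0.3):
    """Assumed logistic redness membership: high when log sSFR is low."""
    return 1.0 / (1.0 + math.exp((log_ssfr - midpoint) / width))

rng = random.Random(42)
# Toy bimodal parent sample: a quenched mode and a star-forming mode.
parent = ([rng.gauss(-3.0, 0.4) for _ in range(500)]
          + [rng.gauss(-1.0, 0.3) for _ in range(500)])

# Keep galaxies whose sSFR-based redness membership exceeds 0.5.
selected = [s for s in parent if red_membership_from_ssfr(s) > 0.5]

# The star-forming mode is removed by the selection rule itself.
print(len(selected), max(selected))
```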

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that galaxy observables exhibit clear bimodality that GMM can separate into physically meaningful components, plus fitted parameters from the SDSS sample itself. No new physical entities are postulated.

free parameters (1)
  • GMM component means, variances, and weights for each observable
    These are fitted to the empirical distributions of (u-r) colour, sSFR, and D4000 in the volume-limited SDSS sample to define the sigmoidal membership functions.
axioms (2)
  • domain assumption: The observed distributions of galaxy colour, specific star formation rate, and D4000 break strength are bimodal and can be decomposed into distinct red and blue components via Gaussian mixture modeling.
    This decomposition is invoked to construct the continuous sigmoidal membership functions for each property.
  • domain assumption: The conservative fuzzy minimum operator is an appropriate way to combine membership degrees across multiple observables.
    The paper adopts this operator without deriving it from first principles or testing alternatives.
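
To show what "testing alternatives" could mean in practice, the sketch below evaluates three standard fuzzy conjunctions (t-norms) on the same set of per-observable memberships. Only the minimum is used in the paper; the product and Łukasiewicz operators are textbook alternatives added here for comparison, and the input memberships are made-up numbers.

```python
import math

def combine_min(memberships):
    """Zadeh minimum: the combined degree equals the weakest observable."""
    return min(memberships)

def combine_product(memberships):
    """Product t-norm: every observable below 1 lowers the combined degree."""
    return math.prod(memberships)

def combine_lukasiewicz(memberships):
    """Lukasiewicz t-norm for n inputs: max(0, sum - (n - 1))."""
    return max(0.0, sum(memberships) - (len(memberships) - 1))

# Hypothetical redness memberships from colour, sSFR, and D4000.
example = (0.9, 0.8, 0.7)
print(combine_min(example), combine_product(example), combine_lukasiewicz(example))
```

The minimum is the largest of all t-norms, so any alternative would assign equal or lower combined memberships; how that would shift the class boundaries is not something the paper reports.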

pith-pipeline@v0.9.0 · 5558 in / 1600 out tokens · 55672 ms · 2026-05-07T03:25:51.098460+00:00 · methodology

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

  1. Almeida, A., Anderson, S. F., Argudo-Fernández, M., et al. 2023, ApJS, 267, 44
  2. Angthopo, J., Ferreras, I., & Silk, J. 2019, MNRAS, 488, L99
     Angthopo, J., Ferreras, I., & Silk, J. 2020, MNRAS, 495, 2720
  3. Arnouts, S., Le Floc’h, E., Chevallard, J., et al. 2013, A&A, 558, A67
  4. Asplund, M., Grevesse, N., Sauval, A. J., & Scott, P. 2009, ARA&A, 47, 481
  5. Baldry, I. K., Balogh, M. L., Bower, R. G., et al. 2006, MNRAS, 373, 469
  6. Balogh, M. L., Baldry, I. K., Nichol, R., et al. 2004, ApJ, 615, L101
  7. Balogh, M. L., Morris, S. L., Yee, H. K. C., Carlberg, R. G., & Ellingson, E. 1999, ApJ, 527, 54
  8. Baum, W. A. 1959, PASP, 71, 106
  9. Bezdek, J. C. 1981, in Advanced Applications in Pattern Recognition
  10. Blanton, M. R., Hogg, D. W., Bahcall, N. A., et al. 2003, ApJ, 594, 186
  11. Brambila, D., Lopes, P. A. A., Ribeiro, A. L. B., & Cortesi, A. 2023, MNRAS, 523, 785
  12. Bremer, M. N., Phillipps, S., Kelvin, L. S., et al. 2018, MNRAS, 476, 12
  13. Brinchmann, J., Charlot, S., White, S. D. M., et al. 2004, MNRAS, 351, 1151
      Bruzual A., G. 1983, ApJ, 273, 105
  14. Cimatti, A., Brusa, M., Talia, M., et al. 2013, ApJ, 779, L13
  15. Coenda, V., Martínez, H. J., & Muriel, H. 2018, MNRAS, 473, 5617
  16. Conroy, C., Gunn, J. E., & White, M. 2009, ApJ, 699, 486
  17. Coppa, G., Mignoli, M., Zamorani, G., et al. 2011, A&A, 535, A10
  18. Eales, S. A., Baes, M., Bourne, N., et al. 2018, MNRAS, 481, 1183
  19. Estrada-Carpenter, V., Papovich, C., Momcheva, I., et al. 2023, ApJ, 951, 115
  20. Faber, S. M., Willmer, C. N. A., Wolf, C., et al. 2007, ApJ, 665, 265
  21. Fritz, A., Scodeggio, M., Ilbert, O., et al. 2014, A&A, 563, A92
  22. Kapur, J., Sahoo, P., & Wong, A. 1985, Computer Vision, Graphics, and Image Processing, 29, 273
  23. Kauffmann, G., Heckman, T. M., White, S. D. M., et al. 2003, MNRAS, 341, 54
  24. Landy, S. D., & Szalay, A. S. 1993, ApJ, 412, 64
      Lopatin, A. G., Brykov, B. A., & Vent, D. P. 2021, in 2021 International Russian Automation Conference (RusAutoCon), 84–88
  25. Lugli, A. B., Neto, E. R., Henriques, J. P. C., et al. 2016, International Journal of Innovative Computing, Information and Control, 12, 665
      Mähönen, P., & Frantti, T. 2000, ApJ, 541, 261
  26. Martin, D. C., Wyder, T. K., Schiminovich, D., et al. 2007, ApJS, 173, 342
  27. Masters, K. L., Mosleh, M., Romer, A. K., et al. 2010, MNRAS, 405, 783
  28. Nandra, K., Georgakakis, A., Willmer, C. N. A., et al. 2007, ApJ, 660, L11
  29. Noirot, G., Sawicki, M., Abraham, R., et al. 2022, MNRAS, 512, 3566
  30. Nyiransengiyumva, B., Povic, M., Nkundabakura, P., Mutabazi, T., & Mahoro, A. 2025, arXiv e-prints, arXiv:2512.20379
  31. Otsu, N. 1979, IEEE Transactions on Systems, Man, and Cybernetics, 9, 62
  32. Pandey, B. 2020, MNRAS, 499, L31
      Pandey, B. 2023, Astronomy and Computing, 44, 100725
      Pandey, B. 2024, MNRAS, 530, 4550
  33. Pandey, B., & Sarkar, S. 2020, MNRAS, 498, 6069
  34. Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, Journal of Machine Learning Research, 12, 2825
  35. Quilley, L., & de Lapparent, V. 2022, A&A, 666, A170
  36. Rosenfeld, A. 1979, Information and Control, 40, 76
      Rosenfeld, A. 1984, Pattern Recognition Letters, 2, 311
  37. Salim, S. 2014, Serbian Astronomical Journal, 189, 1
  38. Schawinski, K., Lintott, C., Thomas, D., et al. 2009, MNRAS, 396, 818
  39. Schawinski, K., Urry, C. M., Simmons, B. D., et al. 2014, MNRAS, 440, 889
  40. Shimasaku, K., Fukugita, M., Doi, M., et al. 2001, AJ, 122, 1238
  41. Spiekermann, G. 1992, AJ, 103, 2102
  42. Stoughton, C., Lupton, R. H., Bernardi, M., et al. 2002, AJ, 123, 485
  43. Strateva, I., Ivezić, Ž., Knapp, G. R., et al. 2001, AJ, 122, 1861
  44. Taylor, E. N., Hopkins, A. M., Baldry, I. K., et al. 2015, MNRAS, 446, 2144
  45. Visvanathan, N. 1981, A&A, 100, L20
  46. Wakileh, B., & Gill, K. 1988, Computers in Industry, 10, 35
  47. Williams, R. J., Quadri, R. F., Franx, M., van Dokkum, P., & Labbé, I. 2009, ApJ, 691, 1879
  48. Wyder, T. K., Martin, D. C., Schiminovich, D., et al. 2007, ApJS, 173, 293
      York, D. G., Adelman, J., Anderson, Jr., J. E., et al. 2000, AJ, 120, 1579
  49. Zadeh, L. 1965, Information and Control, 8, 338
  50. Zadeh, L. A. 1973, IEEE Transactions on Systems, Man, and Cybernetics, SMC-3, 28
  51. Zhang, Z., Wang, H., Luo, W., et al. 2021, A&A, 650, A155