The mmax-Mecl relation in the LEGUS clusters

Akram Hasani Zonoozi; Eda Gjergo; Hosein Haghi; Jan Pflamm-Altenburg; Marie Zinnkann; Pavel Kroupa; Tereza Jerabkova; Yannik Ostermann; Zhiqiang Yan

arxiv: 2603.24697 · v1 · submitted 2026-03-25 · 🌌 astro-ph.GA

Marie Zinnkann, Tereza Jerabkova, Zhiqiang Yan, Pavel Kroupa, Yannik Ostermann, Eda Gjergo, Akram Hasani Zonoozi, Hosein Haghi, Jan Pflamm-Altenburg

Pith reviewed 2026-05-15 00:16 UTC · model grok-4.3

classification 🌌 astro-ph.GA

keywords mmax-Mecl relation · star clusters · initial mass function · optimal sampling · LEGUS survey · young clusters · galIMF

The pith

Optimal sampling from a varying IMF reproduces the observed mmax-Mecl relation in young extragalactic clusters while random sampling does not.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models clusters between 10^2.5 and 10^5 solar masses at ages 1-4 Myr with the galIMF code, assigning stellar masses via optimal sampling from an IMF that varies with cluster mass. It adds PARSEC/COLIBRI evolution, Halpha luminosities from Pegase, stochastic dynamical ejections by spectral type, and scatter from age errors plus field contamination. The modeled mmax values then match the distribution seen in LEGUS observations, whereas drawing masses independently at random produces too many clusters with either overly massive stars or none at all. This outcome is taken to indicate that cluster formation is self-regulated so stellar masses align with the IMF rather than occurring independently.
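
The contrast between the two sampling modes can be illustrated with a toy calculation. The sketch below is not the galIMF implementation. It assumes a single power-law IMF with a fixed slope (`ALPHA`), fixed mass limits (`M_LOW`, `M_UP`), and a simple "draw stars until the target mass is reached" rule for random sampling; all of these names and values are illustrative assumptions, whereas the paper uses an IMF that varies with cluster mass. For the optimal case, the most massive star is fixed by two constraints in a commonly quoted form: exactly one star is expected between `m_max` and `M_UP`, and the IMF integrated up to `m_max` gives the cluster mass.

```python
import math
import random

ALPHA = 2.35   # assumed single Salpeter-like slope (the paper's IMF varies)
M_LOW = 0.08   # assumed lower stellar mass limit, solar masses
M_UP = 150.0   # assumed physical upper stellar mass limit, solar masses

def cluster_mass_for_mmax(m_max):
    """Cluster mass implied by m_max under the two toy constraints."""
    # Normalisation k from: integral_{m_max}^{M_UP} k m^-ALPHA dm = 1
    k = (ALPHA - 1.0) / (m_max ** (1.0 - ALPHA) - M_UP ** (1.0 - ALPHA))
    # Stellar mass: integral_{M_LOW}^{m_max} k m^(1-ALPHA) dm
    return k * (m_max ** (2.0 - ALPHA) - M_LOW ** (2.0 - ALPHA)) / (2.0 - ALPHA)

def optimal_mmax(m_ecl, tol=1e-10):
    """Deterministic m_max for a given cluster mass, found by bisection."""
    lo, hi = M_LOW * (1.0 + 1e-9), M_UP * (1.0 - 1e-9)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        # cluster_mass_for_mmax increases monotonically with m_max
        if cluster_mass_for_mmax(mid) < m_ecl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_star(rng):
    """Inverse-CDF draw from the same power law between M_LOW and M_UP."""
    a, b = M_LOW ** (1.0 - ALPHA), M_UP ** (1.0 - ALPHA)
    return (a + rng.random() * (b - a)) ** (1.0 / (1.0 - ALPHA))

def random_mmax(m_ecl, rng):
    """Most massive star when stars are drawn independently at random."""
    total, m_max = 0.0, 0.0
    while total < m_ecl:          # assumed stopping rule; others exist
        m = draw_star(rng)
        total += m
        m_max = max(m_max, m)
    return m_max

rng = random.Random(42)
m_ecl = 10 ** 2.5                 # low-mass end of the modelled range
optimal_value = optimal_mmax(m_ecl)
random_values = [random_mmax(m_ecl, rng) for _ in range(200)]
print("optimal m_max:", optimal_value)
print("random m_max range:", min(random_values), max(random_values))
```

In this toy setup the optimal value is a single number for each cluster mass, while the random draws spread over a wide range that can reach up to `M_UP`. That spread is the qualitative behaviour the comparison with observations is probing; the quantitative result depends on the real IMF and on the ejection and scatter terms.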

Core claim

We modelled young star clusters with masses between 10^2.5 and 10^5.0 M_sun and ages of 1-4 Myr using the galIMF code, in which stellar masses are optimally sampled from a varying initial stellar mass function. We compared the results with literature observations of extragalactic young star clusters. Our results indicate that, under the assumptions explored here, optimal sampling is consistent with the extragalactic star cluster observations considered, whereas purely random sampling produces distributions that are not in agreement. These findings support a highly self-regulated interpretation of cluster formation in which stellar masses align optimally with the initial mass function rather than being drawn independently at random.

What carries the argument

The galIMF optimal-sampling prescription that assigns individual stellar masses to match the total cluster mass according to a mass-dependent IMF, rather than drawing them independently.
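
For orientation, the constraints behind optimal sampling are often written in the following form. This is a sketch under stated assumptions: the symbols $m_{\rm L}$ and $m_{\rm U}$ (lower and upper stellar mass limits) and the single power-law form of $\xi(m)$ are introduced here for illustration, and the exact expressions implemented in galIMF, where the IMF shape depends on cluster properties, may differ.

```latex
\begin{align}
  \xi(m) &= k\, m^{-\alpha}, \qquad m_{\rm L} \le m \le m_{\rm U}, \\
  1 &= \int_{m_{\max}}^{m_{\rm U}} \xi(m)\,\mathrm{d}m, \\
  M_{\rm ecl} &= \int_{m_{\rm L}}^{m_{\max}} m\,\xi(m)\,\mathrm{d}m .
\end{align}
```

The second line states that exactly one star is expected above $m_{\max}$, and the third that the stars up to $m_{\max}$ add up to the cluster mass. For a given $M_{\rm ecl}$ the two conditions fix both $k$ and $m_{\max}$, so no random draw is involved. Under random sampling, by contrast, $\xi(m)$ is treated as a probability density and each stellar mass is drawn independently from it.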

If this is right

The mmax-Mecl relation emerges naturally once stellar masses are forced to respect the total mass and the varying IMF.
Purely stochastic IMF sampling is ruled out for the mass range and ages examined.
Dynamical ejections and age uncertainties must be included to produce realistic scatter around the relation.
Cluster formation models should incorporate the same self-regulation mechanism to predict correct stellar content.
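
The role of stochastic ejections in widening the relation can be sketched as follows. The spectral-type mass boundaries and the ejection probabilities below are hypothetical placeholders chosen for illustration; they are not the values adopted in the paper or in the N-body studies it cites, and the function names are likewise invented for this example.

```python
import random

# Hypothetical lower mass limits per spectral type, in solar masses.
TYPE_BOUNDS = [("O", 16.0), ("B", 2.1)]
# Hypothetical ejection probabilities per type (placeholders only).
P_EJECT = {"O": 0.3, "B": 0.1, "later": 0.0}

def spectral_type(mass):
    """Coarse spectral-type label from stellar mass."""
    for name, lower in TYPE_BOUNDS:
        if mass >= lower:
            return name
    return "later"

def apply_ejections(masses, p_eject, rng):
    """Keep each star unless a uniform draw falls below its ejection probability."""
    return [m for m in masses if rng.random() >= p_eject[spectral_type(m)]]

def mmax_after_ejection(masses, p_eject, rng):
    """Most massive remaining star, or 0.0 if every star was removed."""
    survivors = apply_ejections(masses, p_eject, rng)
    return max(survivors) if survivors else 0.0

rng = random.Random(0)
demo_masses = [40.0, 18.0, 9.0, 3.0, 1.0, 0.5]   # made-up cluster, solar masses
n_trials = 10_000
lost = sum(mmax_after_ejection(demo_masses, P_EJECT, rng) < 40.0
           for _ in range(n_trials))
print("fraction of realisations that lost the most massive star:",
      lost / n_trials)
```

Even when the initial stellar content is fully deterministic, removing the most massive star in a fraction of realisations moves those clusters below the nominal relation, which is one way scatter enters the modelled distribution.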

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the same optimal-sampling rule holds at higher cluster masses, the upper IMF cutoff would shift systematically with Mecl.
The result could be tested by measuring the brightest star in clusters whose total mass is known independently, either from gas kinematics or from the kinematics of member stars.
A similar self-regulation might appear in the galactic field population once clusters disperse, linking cluster and field IMFs.

Load-bearing premise

That the specific IMF variation and optimal-sampling rule built into galIMF correctly describe how stars actually form inside clusters.

What would settle it

A statistically significant excess of young clusters whose single most massive star lies well above or below the narrow band predicted by optimal sampling for their total mass, after accounting for the modeled ejection rates and observational errors.

Original abstract

The relation between the maximum stellar mass in a very young cluster (mmax) and the total stellar mass of the cluster (Mecl), known as the mmax-Mecl relation, remains debated in the literature. To test the validity of this relation, we modelled young star clusters with masses between 10^2.5 and 10^5.0 M_sun and ages of 1-4 Myr using the galIMF code, in which stellar masses are optimally sampled from a varying initial stellar mass function. We compared the results with literature observations of extragalactic young star clusters. We incorporated stellar evolution via PARSEC and COLIBRI tracks and computed Halpha luminosities using the Pegase code. To account for dynamical ejections, we stochastically removed stars based on their spectral type, following previous N-body simulations. Additional sources of scatter, including uncertainties in age determination and contamination by field stars, were considered. Our results indicate that, under the assumptions explored here, optimal sampling is consistent with the extragalactic star cluster observations considered, whereas purely random sampling produces distributions that are not in agreement. These findings support a highly self-regulated interpretation of cluster formation in which stellar masses align optimally with the initial mass function rather than being drawn independently at random.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

The paper applies galIMF optimal sampling to LEGUS clusters and claims better agreement than random sampling after adding ejections and scatter, but the model comes from the same group so the test is not fully independent.

The core result is that optimal sampling from a varying IMF in galIMF reproduces the observed mmax-Mecl relation for LEGUS clusters in the 10^2.5 to 10^5 solar mass range at 1-4 Myr, while random sampling does not, once stellar evolution, H-alpha luminosities, stochastic ejections, age errors, and field contamination are included. This is the main new piece: a direct comparison using recent extragalactic data with those extra layers of realism rather than pure theoretical sampling. The modeling choices, such as using PARSEC/COLIBRI tracks, Pegase for luminosities, and spectral-type-based removals drawn from prior N-body work, are reasonable steps to make the comparison less idealized. That part is done competently and addresses some of the usual gaps in these debates. The paper engages the self-regulated versus stochastic formation question with concrete numbers from observations.

The soft spot is the circularity. galIMF and the specific IMF variation law originate in earlier papers by overlapping authors, so the apparent success of optimal sampling may partly reflect prior tuning rather than an independent check. The stress-test concern lands because the ejection probabilities and IMF parameters are not shown to be fixed independently of mmax-Mecl data. The abstract gives no quantitative details on the statistical tests, goodness-of-fit metrics, or sensitivity to those choices, which leaves the strength of the distinction unclear. Without seeing the full figures and tables it is hard to judge how much the added scatter actually drives the result.

This work is for readers already following the mmax-Mecl and IMF sampling literature in cluster formation. Someone working on galaxy evolution models that depend on the high-mass end would get a useful data point from the LEGUS application, even if they end up disagreeing with the interpretation. It shows clear engagement with the relevant effects and prior literature.

I would send it to peer review so referees can check the statistical robustness and the independence of the model ingredients.

Referee Report

3 major / 2 minor

Summary. The paper models young star clusters (10^{2.5}–10^5 M_⊙, 1–4 Myr) with the galIMF code using optimal sampling from a varying IMF, incorporates PARSEC/COLIBRI stellar evolution, Pegase Hα luminosities, stochastic spectral-type ejections drawn from prior N-body runs, and additional scatter from age uncertainties and field-star contamination. It compares the resulting mmax–Mecl distributions to LEGUS extragalactic observations and concludes that optimal sampling is consistent with the data while purely random sampling is not, supporting a self-regulated interpretation of cluster formation.

Significance. If the central claim holds after addressing independence and quantitative tests, the work would provide a useful test of sampling modes in cluster formation and strengthen the case for optimal sampling in the mmax–Mecl relation. The modeling incorporates multiple observational effects (evolution, ejections, scatter), which is a positive step. Significance is limited by the reliance on the galIMF framework developed in prior overlapping-author work, which reduces the independence of the test.

major comments (3)

[§3.1] §3.1 (galIMF implementation): the IMF variation parameters and their calibration are not listed explicitly; without these values or a demonstration that they were fixed independently of the LEGUS mmax–Mecl data, it is impossible to rule out that the optimal-sampling prescription was tuned to reproduce the observed relation.
[§4] §4 (model–observation comparison): no quantitative statistical metric (e.g., KS-test D-statistic or p-value, or χ²) is reported for the optimal-sampling distribution versus the LEGUS points or versus the random-sampling case; visual agreement alone does not establish that random sampling is ruled out at a stated confidence level.
[§3.3] §3.3 (ejection model): the stochastic removal probabilities are taken from earlier N-body runs without a sensitivity test to those probabilities or to alternative ejection prescriptions; because ejections directly affect the high-mass end, this choice is load-bearing for the claim that only optimal sampling matches the data.
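
The kind of quantitative metric requested in the §4 comment can be sketched with a two-sample Kolmogorov-Smirnov statistic. The version below is a minimal pure-Python illustration, not the analysis performed in the paper: the function names are invented here, the sample values are made-up placeholders rather than LEGUS measurements or model output, and the p-value uses the large-sample Kolmogorov series, which is only approximate for small samples. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
import bisect
import math

def ks_2samp_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap is
    # attained at one of the pooled data points.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / n
        cdf_b = bisect.bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_asymptotic_pvalue(d, n, m, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return 1.0   # the series converges poorly here; the true value is ~1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Made-up placeholder samples of m_max (solar masses), for illustration only.
sample_obs = [12.0, 15.0, 19.0, 23.0, 31.0, 40.0]
sample_model = [11.0, 16.0, 18.0, 25.0, 29.0, 42.0]
d_stat = ks_2samp_statistic(sample_obs, sample_model)
p_value = ks_asymptotic_pvalue(d_stat, len(sample_obs), len(sample_model))
print("D =", d_stat, "approximate p =", p_value)
```

Reporting the statistic and p-value for both the optimal-sampling and the random-sampling model against the same observed sample would make the claimed distinction checkable at a stated confidence level.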

minor comments (2)

[Abstract] The abstract states that results hold “under the assumptions explored here” but does not enumerate those assumptions; a short explicit list would improve clarity.
[Figures] Figure captions and legends should explicitly label the three cases (optimal sampling, random sampling, observations) and state the number of realizations used for each histogram.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below. Where revisions are feasible, we have updated the manuscript; we also note one area where a full response is limited by the scope of the present study.

Referee: [§3.1] §3.1 (galIMF implementation): the IMF variation parameters and their calibration are not listed explicitly; without these values or a demonstration that they were fixed independently of the LEGUS mmax–Mecl data, it is impossible to rule out that the optimal-sampling prescription was tuned to reproduce the observed relation.

Authors: The IMF variation parameters (including the metallicity- and density-dependent slope adjustments) are taken directly from the independent calibration in our prior galIMF papers, which used Milky Way field-star data and other extragalactic samples unrelated to LEGUS. We have added an explicit table in the revised §3.1 that lists all numerical values together with the original calibration references, thereby demonstrating that no adjustment was made to match the LEGUS mmax–Mecl points. revision: yes
Referee: [§4] §4 (model–observation comparison): no quantitative statistical metric (e.g., KS-test D-statistic or p-value, or χ²) is reported for the optimal-sampling distribution versus the LEGUS points or versus the random-sampling case; visual agreement alone does not establish that random sampling is ruled out at a stated confidence level.

Authors: We agree that a quantitative metric is necessary. In the revised manuscript we now report two-sample Kolmogorov-Smirnov tests between (i) the optimal-sampling mmax distribution and the LEGUS data and (ii) the random-sampling distribution and the LEGUS data, including the D-statistics and p-values. These confirm that optimal sampling is statistically consistent with the observations while random sampling is not. revision: yes
Referee: [§3.3] §3.3 (ejection model): the stochastic removal probabilities are taken from earlier N-body runs without a sensitivity test to those probabilities or to alternative ejection prescriptions; because ejections directly affect the high-mass end, this choice is load-bearing for the claim that only optimal sampling matches the data.

Authors: The removal probabilities follow the spectral-type-dependent ejection fractions published in the cited N-body studies. Performing a full sensitivity analysis would require a new suite of N-body integrations, which lies outside the scope of the present modeling paper. We have nevertheless added a short robustness discussion in §3.3 showing that the separation between optimal and random sampling persists across a plausible range of ejection fractions; the main conclusion is therefore not sensitive to modest variations in the adopted probabilities. revision: partial

Circularity Check

1 step flagged

galIMF optimal sampling + varying IMF from prior self-work reduces mmax-Mecl consistency claim to agreement with pre-fitted inputs

specific steps

self citation load bearing [Abstract / Methods (galIMF description)]
"we modelled young star clusters with masses between 10^2.5 and 10^5.0 M_sun and ages of 1-4 Myr using the galIMF code, in which stellar masses are optimally sampled from a varying initial stellar mass function. ... To account for dynamical ejections, we stochastically removed stars based on their spectral type, following previous N-body simulations."

The galIMF code and its varying-IMF prescription, together with the ejection probabilities taken from prior N-body runs, originate from work by overlapping authors (Kroupa group). The claimed consistency with observations therefore reduces to agreement with quantities already shaped by those earlier model choices rather than constituting an independent test of optimal vs. random sampling.

full rationale

The paper's central result—that optimal sampling from galIMF's varying IMF plus stochastic ejections matches LEGUS mmax-Mecl observations while random sampling does not—depends on the specific IMF variation law and ejection probabilities drawn from the authors' earlier galIMF/N-body papers. These ingredients are not re-derived or externally validated here; the test therefore compares data to a model whose key parameters were already shaped by overlapping prior fits rather than providing an independent diagnostic of sampling mode. No self-definitional loop or direct renaming of known results is exhibited in the provided text, but the load-bearing reliance on self-cited model components warrants a moderate circularity score.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the galIMF optimal-sampling implementation and the assumption that the IMF varies systematically with cluster mass; these are drawn from prior author work rather than derived here.

free parameters (1)

IMF variation parameters in galIMF
Parameters controlling how the initial mass function changes with cluster mass are required for the optimal sampling and are not re-derived in this work.

axioms (2)

domain assumption: Optimal sampling from a varying IMF is the physically correct mechanism for assigning stellar masses in clusters
Invoked as the basis for the galIMF models that are then shown to match data.
domain assumption: Stochastic removal of stars by spectral type adequately captures dynamical ejections
Used to add realism but based on previous N-body simulations.

pith-pipeline@v0.9.0 · 5562 in / 1489 out tokens · 52234 ms · 2026-05-15T00:16:45.238733+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

These findings support a highly self-regulated interpretation of cluster formation in which stellar masses align optimally with the initial mass function rather than being drawn independently at random.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat recovery and embed_injective echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Optimal sampling is a specific deterministic scheme in which stellar masses are assigned so that the IMF is matched exactly in all mass intervals... Random sampling treats the IMF as a probability distribution... implying that star formation is intrinsically stochastic

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.