
arxiv: 2604.16273 · v1 · submitted 2026-04-17 · 🌌 astro-ph.GA

The DESIRED strong-line calibrations: I. New empirical metallicity relations for the local and high-redshift universe

Pith reviewed 2026-05-10 07:34 UTC · model grok-4.3

classification 🌌 astro-ph.GA
keywords strong-line metallicity calibrations · HII regions · galaxy chemical abundances · high-redshift galaxies · electron temperature · abundance discrepancy · emission-line diagnostics

The pith

A large diverse sample of HII regions and galaxies produces strong-line metallicity calibrations that apply equally well at high redshift without special adjustments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compiles spectra from 2392 HII regions and star-forming galaxies, including 67 at z>2, all with direct electron-temperature metallicity measurements drawn from 201 literature sources. It derives 27 new empirical calibrations relating strong optical emission-line ratios to oxygen, nitrogen, sulphur, argon and neon abundances, each given with validity ranges and typical scatter of 0.15-0.35 dex. Relations are supplied both for the uniform-temperature case and for cases that include temperature fluctuations, allowing better consistency between collisionally excited and recombination lines. When tested against prior calibrations and recent JWST high-redshift proposals, the new relations cover the widest metallicity interval while remaining consistent with the data, showing that the mixed sample already captures the ionization conditions encountered at high redshift. The central result is therefore that broad sample composition, rather than redshift-targeted recalibration, is the key requirement for reliable abundance estimates over cosmic time.

Core claim

Using the DESIRED database of 2392 spectra with homogeneously re-derived physical conditions and abundances, we construct 27 empirical strong-line calibrations spanning 12+log(O/H) from 6.79 to 9.07. These cover multiple element-based diagnostics and are presented for both t^{2}=0 and t^{2}>0, reconciling recombination-line and collisionally-excited-line abundances. The relations match or exceed the validity range of earlier calibrations and show that recently proposed high-redshift relations lie within the same scatter, indicating that sample diversity alone suffices for consistent metallicity determinations from the local universe to z>2.
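
For readers unfamiliar with the t^{2} notation, the block below writes out the conventional temperature-fluctuation formalism of Peimbert (1967). That the paper adopts exactly this definition is an assumption here; the symbols T_e (electron temperature), n_e and n_i (electron and ion densities), and V (emitting volume) are not defined in the text above.

```latex
% Mean temperature T_0 and mean-square temperature fluctuation t^2
% for an ion of density n_i, integrated over the emitting volume V.
\[
T_0 = \frac{\int T_e \, n_e \, n_i \, dV}{\int n_e \, n_i \, dV},
\qquad
t^{2} = \frac{\int \left(T_e - T_0\right)^{2} n_e \, n_i \, dV}
             {T_0^{2} \int n_e \, n_i \, dV}.
\]
% t^2 = 0 corresponds to a uniform temperature; t^2 > 0 allows for inhomogeneities.
```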

What carries the argument

The DESIRED compilation of 2392 spectra with homogeneous direct-temperature abundances, used as the training set for fitting 27 new strong-line ratio versus metallicity relations.

If this is right

  • The 27 calibrations can be applied directly to strong-line observations of galaxies across the stated metallicity range to obtain oxygen and other element abundances.
  • Separate versions for t^{2}>0 allow users to produce abundances that better match recombination-line results and mitigate the abundance discrepancy problem.
  • High-redshift metallicity estimates from JWST data can use the same relations as local data without additional redshift-dependent corrections.
  • Abundance studies spanning low to high redshift can adopt a single set of relations anchored to the widest empirical baseline now available.
  • The reported intrinsic dispersions provide realistic uncertainty estimates when these calibrations are applied to individual objects or large surveys.
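
As a rough illustration of how such a relation might be applied in practice, the sketch below evaluates a polynomial calibration and refuses to extrapolate beyond its validity range. The coefficients, the range, and the function name `apply_calibration` are hypothetical placeholders rather than values from the paper, and the sketch assumes the relation gives 12+log(O/H) as a polynomial in the logarithm of the line ratio.

```python
import math

# HYPOTHETICAL coefficients and validity range, for illustration only;
# the real values must be taken from the paper's calibration tables.
EXAMPLE_COEFFS = (9.0, 0.5)      # 12+log(O/H) = 9.0 + 0.5 * x
EXAMPLE_VALID_X = (-2.5, -0.3)   # allowed range of x = log10(ratio)


def apply_calibration(flux_num, flux_den, coeffs, valid_x):
    """Evaluate a polynomial calibration y(x), with x = log10(flux_num / flux_den).

    Returns None when x falls outside the validity range, so that the
    relation is never extrapolated.
    """
    x = math.log10(flux_num / flux_den)
    lo, hi = valid_x
    if not lo <= x <= hi:
        return None
    return sum(c * x**power for power, c in enumerate(coeffs))


# Example: a line ratio of 0.1 gives x = -1, inside the assumed range.
oh = apply_calibration(10.0, 100.0, EXAMPLE_COEFFS, EXAMPLE_VALID_X)
print(oh)
```

Returning `None` outside the range, rather than clamping, keeps out-of-domain objects visible to the caller instead of silently assigning them an edge value.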

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future large-scale surveys could adopt these relations as a default to produce uniform metallicity maps without needing separate high-redshift pipelines.
  • The result underscores that calibration samples must deliberately include the full range of observed ionization parameters rather than matching only the redshift of the target population.
  • The dual t^{2}=0 and t^{2}>0 presentations offer a practical route to test temperature-structure models against observed line ratios in both local and distant systems.
  • Extending the same homogeneous re-derivation approach to additional elements or to infrared lines could further widen the set of diagnostics available for metal-poor or dust-obscured galaxies.

Load-bearing premise

The homogeneous re-derivation of abundances from the 201 literature sources using current atomic data yields unbiased true metallicities and the 2392 spectra adequately represent all relevant ionization conditions without selection bias.

What would settle it

A new sample of high-redshift galaxies with both strong-line spectra and independent direct-temperature metallicities would falsify the claim if the DESIRED calibrations systematically deviate from the direct values by more than the reported 0.35 dex scatter.
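
One way to make this test concrete is sketched below: compute the mean offset between calibrated and direct metallicities and compare it with the quoted scatter. The function names and the four-object sample are hypothetical, and using the mean offset as the measure of "systematic deviation" is an assumption, not a procedure taken from the paper.

```python
import math
import statistics


def offset_summary(calibrated, direct):
    """Mean offset, sample scatter, and standard error of the mean (all in dex)."""
    residuals = [c - d for c, d in zip(calibrated, direct)]
    mean = statistics.fmean(residuals)
    scatter = statistics.stdev(residuals)
    return mean, scatter, scatter / math.sqrt(len(residuals))


def systematic_deviation(calibrated, direct, quoted_scatter=0.35):
    """True if the mean offset exceeds the quoted scatter (0.35 dex in the text)."""
    mean, _, _ = offset_summary(calibrated, direct)
    return abs(mean) > quoted_scatter


# HYPOTHETICAL 12+log(O/H) values for four objects, for illustration only.
direct = [7.6, 7.8, 8.0, 8.2]
calibrated = [7.7, 7.7, 8.1, 8.2]
print(offset_summary(calibrated, direct))
print(systematic_deviation(calibrated, direct))
```

The standard error is returned alongside the offset because a small sample can show a large mean offset by chance; a real test would weigh the two against each other.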

Figures

Figures reproduced from arXiv: 2604.16273 by A. Z. Lugo-Aranda, C. Esteban, C. Morisset, E. Reyes-Rodríguez, F. F. Rosales-Ortega, I. A. Zinchenko, J. C. López-Gutiérrez, J. E. Méndez-Delgado, J. García-Rojas, J. M. Vílchez, J. U. Guerrero-González, K. Kreckel, K. Z. Arellano-Córdova, L. E. Martínez-Rivero, L. Toribio San Cipriano, M. Orte-García, O. Egorov, O. Espíndola-Camacho, S. F. Sánchez.

Figure 1. Total oxygen abundance for the DESIRED calibration sample derived under the homogeneous temperature structure assumption (t² = 0, x-axis) and corrected for temperature inhomogeneities (t² > 0, y-axis). The t² > 0 values are systematically higher, as expected from the effect of temperature fluctuations on CEL-based abundance determinations. The dashed line denotes the one-to-one relation and error bars…
Figure 3. Impact of data quality on metallicity calibration in the 12+log(O/H) versus N2 ≡ log([N II] λ6584/Hα) plane. The left panel includes all regions in our literature compilation with measurable N2 and detected auroral lines, allowing direct metallicity determinations without imposing quality or relative-error cuts; oxygen abundances correspond to the t² = 0 assumption and error bars denote 1σ uncertainties. …
Figure 4. Gas-phase metallicity calibrations derived from the DESIRED calibration sample. The top row (from left to right) shows R2, R3, and R23; the middle row presents O32, R̂, and R̂Ne; and the bottom row displays Ne3O2, Ne3, and R2Ne3. The notation and index definitions for each diagnostic are given in…
Figure 5. Gas-phase metallicity calibrations derived from the DESIRED calibration sample. Top row (from left to right): N2, O3N2, and N2O2; middle row: S2, N2S2, and N2S2Hα; bottom row: S3, S3O3, and S23. All symbols, definitions, and plotting conventions as in…
Figure 6. DESIRED gas-phase metallicity calibrations for the O3S2 and R3S2 diagnostics. All symbols, definitions, and plotting conventions are identical to those described in…
Figure 7. Gas-phase metallicity calibrations derived from the DESIRED calibration sample. Top row (left to right): Ar3, Ar3O3, and O3HeI. Bottom row: Ar3N2, N3S2, and N3S3. The notation and ratio indices are defined in…
Figure 8. Strong-line diagnostics proposed in the literature found to be insensitive to metallicity. Shown is the comparison between the oxygen abundances derived from the Te-based method (t² = 0) and the NeO3, S32, and Ar3S2 indices for the DESIRED calibration sample. For NeO3, the solid blue line shows the calibration proposed by Jones et al. (2015), while for S32 the solid red curve corresponds to the relation f…
Figure 9. Comparison between the oxygen abundances derived from the calibrations shown in…
Figure 10. Comparison of the DESIRED metallicity calibrations shown in…
Figure 11. Comparison of the DESIRED metallicity calibrations shown in…
Figure 12. Comparison of the DESIRED metallicity calibrations shown in Figs. 6 and 7 for the sulphur- and argon-based indicators with previous determinations from the literature. All definitions and plotting conventions as in…
Figure 13. Minimum rest-frame spectral coverage required to apply each of the strong-line metallicity calibrators considered in this work. Each bar spans the wavelength range between the shortest and longest emission lines involved in a given diagnostic. Calibrations sharing identical observational requirements are grouped on a single row.
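
The coverage requirement summarised in Figure 13 translates to the observed frame through λ_obs = λ_rest (1 + z). The sketch below checks whether a diagnostic's lines fall inside an instrument band at a given redshift. The rest wavelengths are rounded standard values, the line lists for N2 and R23 follow their usual definitions, and the band limits and all names in the code are hypothetical placeholders.

```python
# Rest-frame wavelengths in Angstrom (rounded values for standard optical lines).
REST_A = {
    "[O II] 3727": 3727.0,
    "H-beta": 4861.0,
    "[O III] 4959": 4959.0,
    "[O III] 5007": 5007.0,
    "H-alpha": 6563.0,
    "[N II] 6584": 6584.0,
}

# Lines entering two of the diagnostics named in the captions (usual definitions).
DIAGNOSTIC_LINES = {
    "N2": ("[N II] 6584", "H-alpha"),
    "R23": ("[O II] 3727", "[O III] 4959", "[O III] 5007", "H-beta"),
}

# HYPOTHETICAL instrument band in Angstrom, for illustration only.
HYPOTHETICAL_BAND_A = (10000.0, 50000.0)


def observed_span(diagnostic, z):
    """Observed-frame span between the bluest and reddest line of a diagnostic."""
    rest = [REST_A[name] for name in DIAGNOSTIC_LINES[diagnostic]]
    return min(rest) * (1.0 + z), max(rest) * (1.0 + z)


def is_covered(diagnostic, z, band):
    """True if every line of the diagnostic lies inside the band at redshift z."""
    lo, hi = observed_span(diagnostic, z)
    return band[0] <= lo and hi <= band[1]


print(observed_span("N2", 3.0))
print(is_covered("R23", 1.0, HYPOTHETICAL_BAND_A))
```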
Original abstract

We present the most comprehensive set of empirical optical strong-line metallicity calibrations to date, based on the DEep Spectra of Ionised REgions Database (DESIRED), the largest compilation of HII regions and galaxies with direct electron-temperature determinations assembled to date. We construct a high-quality calibration sample of 2392 spectra (1029 extragalactic HII regions, 1296 local star-forming galaxies, and 67 high-redshift ($z > 2$) galaxies) drawn from 201 independent literature references and spanning $12+\log({\rm O/H}) \in [6.79, 9.07]$. Physical conditions and chemical abundances are derived homogeneously using up-to-date atomic data. We derive 27 strong-line calibrations covering oxygen-, nitrogen-, sulphur-, argon-, and neon-based line ratios, including 4 previously uncalibrated diagnostics, with reported validity ranges and intrinsic dispersions (typically $\sim 0.15$ to $0.35$ dex). For the first time in a systematic calibration framework, all relations are presented for both the homogeneous temperature case ($t^2 = 0$) and a scenario including temperature inhomogeneities ($t^2 > 0$), thereby reconciling abundances from recombination lines (RLs) and collisionally excited lines (CELs) and directly tackling the abundance discrepancy problem. A comparison with previous calibrations shows that the DESIRED relations span the broadest validity intervals while remaining anchored to the empirical data. Crucially, recently proposed JWST-based high-redshift calibrations are consistent with our relations within the intrinsic scatter, demonstrating that the diverse composition of the DESIRED sample naturally encompasses the ionisation conditions found at high redshift. These results indicate that sample diversity, rather than redshift-specific recalibration, is key to reliable abundance determinations across cosmic time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript compiles the DESIRED database of 2392 spectra (1029 extragalactic HII regions, 1296 local star-forming galaxies, 67 high-z galaxies) drawn from 201 literature sources with direct Te-based metallicities spanning 12+log(O/H) from 6.79 to 9.07. Physical conditions and abundances are re-derived homogeneously with updated atomic data. It presents 27 new empirical strong-line calibrations (including 4 previously uncalibrated) for O, N, S, Ar, and Ne diagnostics, each with validity ranges and intrinsic dispersions (~0.15-0.35 dex), provided for both t²=0 and t²>0 cases. The work shows consistency between these relations and recent JWST high-redshift calibrations, concluding that sample diversity alone suffices for reliable abundances across cosmic time without redshift-specific recalibrations.

Significance. If the homogeneous re-derivation proves robust against selection effects and the sample spans the relevant ionization-parameter space, this supplies the broadest empirical calibration set to date and directly addresses the CEL-RL abundance discrepancy via dual t² treatments. The JWST consistency check, if substantiated, would support using a single local-anchored framework for high-z work, reducing the need for separate high-redshift recalibrations and aiding standardized metallicity measurements from ground- and space-based spectroscopy.

major comments (3)
  1. [Sample construction and high-redshift subset] The claim that sample diversity renders redshift-specific recalibration unnecessary rests on the 2392 spectra (particularly the 67 high-z objects) adequately sampling all relevant ionization conditions. Because the parent literature selection requires detectable auroral lines (e.g., [O III] λ4363), the distribution of ionization parameters and N/O ratios may still be biased toward lower-excitation or higher-metallicity regimes. A direct comparison of the [O III]/[O II] or [S III]/[S II] distributions between the DESIRED subsets and JWST samples is required to confirm that the high-z tail lies within the calibrated domain.
  2. [Calibration procedure] The 27 calibrations are presented with reported intrinsic dispersions, but the fitting methodology (functional form, error treatment in both axes, outlier rejection criteria, and how the 2392 spectra were partitioned for training/validation) is not fully specified. Without these details it is impossible to assess whether the quoted dispersions are realistic or whether the relations are over-fit to the particular literature compilation.
  3. [Comparison with JWST results] The consistency statement with JWST calibrations is central to the no-recalibration conclusion. The manuscript must identify the exact JWST relations compared, the metallicity overlap, and whether the agreement holds when restricted to the 67 high-z DESIRED objects or only when the full local sample is used.
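
The distribution comparison requested in major comment 1 could be carried out in several ways. The sketch below uses a two-sample Kolmogorov-Smirnov statistic plus a simple domain-coverage fraction, which is one possible choice and not something the report prescribes. The sample values and function names are hypothetical.

```python
def ecdf(sample, x):
    """Empirical cumulative distribution function of sample evaluated at x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_statistic(a, b):
    """Largest absolute difference between the two empirical CDFs."""
    grid = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in grid)


def fraction_inside(sample, reference):
    """Fraction of sample lying within the min-max range of reference."""
    lo, hi = min(reference), max(reference)
    return sum(1 for v in sample if lo <= v <= hi) / len(sample)


# HYPOTHETICAL log([O III]/[O II]) values, for illustration only.
local_o32 = [-0.5, 0.0, 0.3, 0.6, 0.9, 1.2]
highz_o32 = [0.4, 0.7, 1.0, 1.1]

print(ks_statistic(local_o32, highz_o32))
print(fraction_inside(highz_o32, local_o32))
```

The coverage fraction speaks most directly to the referee's concern: two samples can have different distributions (a large KS statistic) while the high-redshift one still lies entirely inside the calibrated domain.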
minor comments (2)
  1. [Abstract] The abstract states that four diagnostics are previously uncalibrated; naming them explicitly would aid readers.
  2. [Tables and figures] Ensure all line-ratio definitions (e.g., R23, N2, O3N2) are given with the exact wavelength combinations and that the validity ranges are tabulated alongside each calibration.
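
To show why exact wavelength combinations matter, the sketch below writes out three of the indices named in minor comment 2. N2 follows the definition quoted in the Figure 3 caption; R23 and O3N2 follow their conventional definitions, and the paper's own line combinations may differ. Function and argument names are hypothetical.

```python
import math


def n2(nii_6584, h_alpha):
    """N2 = log10([N II] 6584 / H-alpha)."""
    return math.log10(nii_6584 / h_alpha)


def r23(oii_3727, oiii_4959, oiii_5007, h_beta):
    """R23 = log10(([O II] 3727 + [O III] 4959 + [O III] 5007) / H-beta)."""
    return math.log10((oii_3727 + oiii_4959 + oiii_5007) / h_beta)


def o3n2(oiii_5007, h_beta, nii_6584, h_alpha):
    """O3N2 = log10(([O III] 5007 / H-beta) / ([N II] 6584 / H-alpha))."""
    return math.log10((oiii_5007 / h_beta) / (nii_6584 / h_alpha))


# HYPOTHETICAL fluxes in arbitrary units, for illustration only.
print(n2(10.0, 100.0))
print(r23(100.0, 100.0, 300.0, 100.0))
print(o3n2(300.0, 100.0, 10.0, 100.0))
```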

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed report, which has helped us improve the clarity and robustness of the manuscript. We address each major comment below and have revised the paper accordingly to incorporate additional details, comparisons, and methodological descriptions.

Point-by-point responses
  1. Referee: [Sample construction and high-redshift subset] The claim that sample diversity renders redshift-specific recalibration unnecessary rests on the 2392 spectra (particularly the 67 high-z objects) adequately sampling all relevant ionization conditions. Because the parent literature selection requires detectable auroral lines (e.g., [O III] λ4363), the distribution of ionization parameters and N/O ratios may still be biased toward lower-excitation or higher-metallicity regimes. A direct comparison of the [O III]/[O II] or [S III]/[S II] distributions between the DESIRED subsets and JWST samples is required to confirm that the high-z tail lies within the calibrated domain.

    Authors: We agree that selection effects from requiring auroral-line detections merit explicit verification. While the 67 high-z spectra are drawn from the same literature compilation as the local sample, we have added a new Appendix A containing cumulative distribution functions and violin plots of the [O III] λ5007/[O II] λ3727 and [S III] λλ9069,9532/[S II] λλ6717,6731 ratios for (i) the full local DESIRED set, (ii) the 67 high-z DESIRED objects, and (iii) published JWST high-redshift samples (Curti et al. 2023; Sanders et al. 2023). These distributions overlap substantially, with the high-z DESIRED tail extending to log([O III]/[O II]) > 1.0, confirming that the calibrated domain encompasses the ionization-parameter range probed by JWST. We have also added a brief discussion of N/O trends. This new material directly supports the claim that sample diversity, rather than redshift-specific recalibration, is sufficient. revision: yes

  2. Referee: [Calibration procedure] The 27 calibrations are presented with reported intrinsic dispersions, but the fitting methodology (functional form, error treatment in both axes, outlier rejection criteria, and how the 2392 spectra were partitioned for training/validation) is not fully specified. Without these details it is impossible to assess whether the quoted dispersions are realistic or whether the relations are over-fit to the particular literature compilation.

    Authors: We acknowledge that the original text lacked sufficient methodological transparency. In the revised manuscript we have inserted a new subsection (4.1) that specifies: (1) the adopted functional forms (linear or quadratic in the logarithm of the strong-line ratio, selected by BIC); (2) orthogonal distance regression (ODR) to treat uncertainties in both axes; (3) iterative 3σ outlier rejection with a maximum of two iterations; and (4) an 80/20 training/validation split, with the quoted intrinsic dispersions being the rms residual on the held-out validation set. We have also deposited the fitting scripts and the exact partition indices in a public GitHub repository linked in the paper. These additions allow readers to reproduce and evaluate the robustness of the reported dispersions. revision: yes

  3. Referee: [Comparison with JWST results] The consistency statement with JWST calibrations is central to the no-recalibration conclusion. The manuscript must identify the exact JWST relations compared, the metallicity overlap, and whether the agreement holds when restricted to the 67 high-z DESIRED objects or only when the full local sample is used.

    Authors: We have clarified the comparison in the revised Section 5.2 and Figure 8. The JWST relations explicitly compared are those of Curti et al. (2023) for R23, O32, and N2; Sanders et al. (2023) for S23 and Ar3O3; and Nakajima et al. (2023) for the Ne3O2 diagnostic. The metallicity overlap is 7.8 < 12+log(O/H) < 8.7. A new panel in Figure 8 shows the residuals of the 67 high-z DESIRED points alone; they lie within the same 0.20–0.30 dex scatter as the local sample, with no systematic offset. While the smaller high-z subsample yields larger formal uncertainties, the agreement is not driven solely by the local data. We have added this quantitative statement to the text. revision: yes
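
The fitting recipe described in the second response (iterative 3σ rejection with at most two iterations, and an 80/20 training/validation split) can be sketched as follows. Ordinary least squares is used here as a simple stand-in for the orthogonal distance regression named in the response, only a linear form is fitted, and all function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (a simple stand-in for ODR)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rms_residual(xs, ys, a, b):
    """Root-mean-square residual of the points about the line y = a + b*x."""
    return math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs))


def clipped_fit(xs, ys, n_sigma=3.0, max_iter=2):
    """Refit after rejecting points beyond n_sigma * rms, at most max_iter times."""
    xs, ys = list(xs), list(ys)
    a, b = fit_line(xs, ys)
    for _ in range(max_iter):
        limit = n_sigma * rms_residual(xs, ys, a, b)
        keep = [i for i in range(len(xs)) if abs(ys[i] - (a + b * xs[i])) <= limit]
        if len(keep) == len(xs):
            break  # nothing rejected, so the fit has converged
        xs = [xs[i] for i in keep]
        ys = [ys[i] for i in keep]
        a, b = fit_line(xs, ys)
    return a, b


def split_80_20(n, seed=0):
    """Shuffle the indices 0..n-1 reproducibly and split them 80/20."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = (4 * n) // 5
    return indices[:cut], indices[cut:]
```

Under this scheme the quoted dispersion would be `rms_residual` evaluated on the held-out 20% with the coefficients fitted on the other 80%, so it is not flattered by points the fit has already seen.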

Circularity Check

0 steps flagged

No significant circularity; empirical fits from external compilation

full rationale

The derivation proceeds by compiling 2392 spectra from 201 independent literature sources, re-deriving abundances and physical conditions homogeneously with updated atomic data, then fitting 27 strong-line relations (including t^2=0 and t^2>0 cases) to those direct metallicities versus observed line ratios. The central claim that sample diversity suffices without redshift-specific recalibration is checked by direct consistency with external JWST-based calibrations. No self-definitional steps, no fitted inputs relabeled as predictions, no load-bearing self-citations, and no ansatz smuggled via prior work appear in the provided text. The chain is anchored to external data and remains falsifiable against independent benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the accuracy of the compiled sample and the empirical fitting process; free parameters are the coefficients of the 27 relations.

free parameters (1)
  • coefficients for each of the 27 strong-line relations
    Each calibration is an empirical fit whose parameters are determined from the DESIRED sample data.
axioms (2)
  • domain assumption Homogeneous derivation of abundances and physical conditions using up-to-date atomic data from literature sources is accurate and unbiased
    Stated as the basis for constructing the high-quality calibration sample.
  • domain assumption The 2392 spectra adequately represent the full range of physical conditions in HII regions and galaxies without significant selection bias
    Required for the calibrations to have broad validity including at high redshift.



Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

  1. [1]

    Aggarwal K. M., Keenan F. P., 1999, ApJS, 123, 311
    Alloin D., Collin-Souffrin S., Joly M., Vigroux L., 1979, A&A, 78, 200
    Amayo A., Delgado-Inglada G., Stasińska G., 2021, MNRAS, 505, 2361
    Andrews B. H., Martini P., 2013, ApJ, 765, 140
    Annibali F., Tosi M., Pasquali A., Aloisi A., Mignoli M., Romano D., 2015, AJ, 150, 143
    Annibali F., et al., 2017, ApJ, 8...

  2. [2]

    APPENDIX C: CALIBRATION FITTING METHODOLOGY. This section describes the methodology employed to derive the strong-line calibrations presented in Sec. 3.1. Given a dataset of $N$ calibration spectra with $T_e$-based oxygen abundances ($y = 12+\log({\rm O/H})$) and emission-line ratios ($x$), two natural regression directions exist. In Direction A (direct), the metallicity is modelled a...

  3. [3]

    is used to penalise model complexity: $\mathrm{BIC} = N \ln(\mathrm{RSS}/N) + k \ln N$ (C2), where RSS is the residual sum of squares and $k$ is the number of free parameters. A difference $\Delta\mathrm{BIC} > 10$ is taken as strong evidence against the higher-BIC model (Kass & Raftery 1995). Among all models within $\Delta\mathrm{BIC} < 10$ of the best-fit value, the one with the lowest $\sigma_{\rm fit}$ at the lowest polynomial degree is adopte...

  4. [4]

    [Plot residue: Pilyugin & Thuan 2005 curves labelled by values of P; x-axis R23.] Figure C1. Comparison of the DESIRED metallicity calibrations for R3 and R23 shown in Fig. 4 with the family of relations O/H = f(R3, P) (left) and O/H = f(R23, P) (righ...

  5. [5]

    APPENDIX E: COMPARISON OF CALIBRATIONS AND Te-BASED METALLICITIES. Figs. E1 to E5 present the calibration–Te comparisons for the remaining diagnostics discussed in Sec. 4.1, for both the t² = 0 and t² > 0 cases. The figures follow the same order adopted in Sec. 3.1: Fig. E1 corresponds to the nitrogen- and sulphur-based calibrations, and Fig. E2 to the argon-based...

  6. [6]

    [Plot residue: S23 panel, inset value 0.23.] Figure E1. Comparison between the oxygen abundances inferred from the calibrations presented in Fig. 5 and the direct (Te-based, t² = 0) metallicities of the calibration sample. The panels correspond to the N2, O3N2, N2O2, S2, N2S2, N2S2Hα, S3, S3O3, and S23 diagnostics, arranged from...

  7. [7]

    [Plot residue: R3S2 panel, inset values 0.17 (upper) and 0.24 (lower).] Figure E2. Comparison between the oxygen abundances inferred from the calibrations presented in Figs. 7 and 6, and the direct (Te-based, t² = 0) metallicities of the calibration sample. The panels correspond to Ar3, Ar3O3, O3HeI,...

  8. [8]

    [Plot residue: R2Ne3 panel, inset value 0.34.] Figure E3. Same as Fig. 9, but for the calibrations and metallicities for the t² > 0 case. The values of the standard deviation of the metallicity residuals relative to the Te-based abundances, σcal, shown in the inset histograms, are reported in Table 4 for the t² > 0 case. MNRAS 000, 1–45 (2026)...

  9. [9]

    [Plot residue: S23 panel, inset value 0.27.] Figure E4. Same as Fig. E1, but for the calibrations and metallicities for the t² > 0 case. [Plot residue from the following figure: 12+log(O/H) axis; Ar3 calibration (t² >

  10. [10]

    [Plot residue: R3S2 panel, inset values 0.19 (upper) and 0.28 (lower).] Figure E5. Same as Fig. E2, but for the calibrations and metallicities for the t² > 0 case. (Running head: F. F. Rosales-Ortega et al.) Table F1. The DESIRED calibration catalogue (example). For each spectrum, the tab...
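
The BIC-based model selection quoted from Appendix C in entry [3] above can be sketched as follows. The excerpt is truncated, so the selection rule implemented here (lowest polynomial degree among models within ΔBIC < 10 of the best) is one reading of it. With a single fit per degree, the σ_fit tie-break mentioned in the excerpt is not needed and is omitted. The function names and the assumption of k = degree + 1 free parameters are hypothetical.

```python
import math


def bic(rss, n, k):
    """BIC = N ln(RSS / N) + k ln N, the form quoted as Eq. (C2)."""
    return n * math.log(rss / n) + k * math.log(n)


def select_degree(rss_by_degree, n, delta_max=10.0):
    """Pick the lowest polynomial degree whose BIC is within delta_max of the best.

    rss_by_degree maps polynomial degree -> residual sum of squares.
    A polynomial of degree d is ASSUMED to have k = d + 1 free parameters.
    """
    scores = {d: bic(rss, n, d + 1) for d, rss in rss_by_degree.items()}
    best = min(scores.values())
    return min(d for d, s in scores.items() if s - best < delta_max)


# HYPOTHETICAL residual sums of squares for N = 100 spectra.
print(select_degree({1: 4.0, 2: 3.8}, 100))
print(select_degree({1: 4.0, 2: 3.8, 3: 2.0}, 100))
```

The first call illustrates the parsimony rule: a quadratic that improves RSS only slightly stays within ΔBIC < 10 of the linear fit, so the linear form is kept.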