A Unified HI Rotation Curve Corpus for Computational Astrophysics: 438 Galaxies from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2
Pith reviewed 2026-05-10 13:12 UTC · model grok-4.3
The pith
A single verified corpus unifies 8,963 HI rotation curve measurements across 423 galaxies (438 catalog entries) drawn from four major surveys.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2).
What carries the argument
The unified JSON file containing nested per-ring kinematic data, survey metadata, and a two-tier quality flag system that separates hand-curated curves from automated products.
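To make the nested layout concrete, here is a minimal sketch of how per-ring data nested under each galaxy could be flattened and filtered by tier. Every key name (`galaxies`, `rings`, `r_kpc`, `v_kms`, `e_v_kms`, `tier`) and every value is an assumption made for illustration; the released file's actual schema is not reproduced on this page.

```python
import json

# Hypothetical record layout: all keys and values are illustrative
# assumptions, not the schema of the released corpus.
corpus_text = """
{
  "galaxies": [
    {"name": "DEMO-A", "survey": "SPARC", "tier": 1,
     "rings": [{"r_kpc": 1.0, "v_kms": 50.0, "e_v_kms": 3.0},
               {"r_kpc": 2.0, "v_kms": 80.0, "e_v_kms": 4.0}]},
    {"name": "DEMO-B", "survey": "WALLABY DR2", "tier": 2,
     "rings": [{"r_kpc": 3.0, "v_kms": 120.0, "e_v_kms": null}]},
    {"name": "DEMO-C", "survey": "THINGS", "tier": 1, "rings": []}
  ]
}
"""

corpus = json.loads(corpus_text)


def flatten(corpus):
    """Yield one flat row per ring, carrying galaxy-level fields along."""
    for gal in corpus["galaxies"]:
        for ring in gal["rings"]:
            yield {"name": gal["name"], "survey": gal["survey"],
                   "tier": gal["tier"], **ring}


rows = list(flatten(corpus))

# Keep only rings from hand-curated (Tier 1) curves.
tier1_rows = [row for row in rows if row["tier"] == 1]

# Metadata-only entries are those with no rings at all.
metadata_only = [g["name"] for g in corpus["galaxies"] if not g["rings"]]
```

Keeping the tier on the galaxy record and copying it onto each flattened row means a quality filter can be applied at either level without a join.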
If this is right
- Multi-survey analyses of galaxy dynamics become possible without repeated format conversion or cross-checking steps.
- The three worked Python examples demonstrate that standard tasks such as rotation-curve plotting and baryonic mass modeling require fewer than fifteen lines of code.
- The RAG-friendly structure allows large-language-model pipelines to retrieve specific rotation-curve segments or quality-filtered subsets directly.
- The public CC-BY license and Zenodo DOI remove legal and access barriers for reuse in computational astrophysics studies.
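As an illustration of what a short baryonic analysis can look like, the sketch below combines gas, disk, and bulge velocity contributions with the signed-square convention used in SPARC-style mass models. It is not one of the paper's three worked examples; the field names, the ring values, and the default mass-to-light ratios (0.5 for the disk, 0.7 for the bulge) are assumptions chosen for the demonstration.

```python
import math


def v_baryonic(v_gas, v_disk, v_bulge=0.0, ml_disk=0.5, ml_bulge=0.7):
    """Baryonic circular velocity in km/s from component velocities.

    Uses the signed-square convention V|V| so that a negative component
    velocity (for example a central gas depression) reduces the total.
    The mass-to-light defaults are illustrative assumptions.
    """
    v2 = (v_gas * abs(v_gas)
          + ml_disk * v_disk * abs(v_disk)
          + ml_bulge * v_bulge * abs(v_bulge))
    # Return a signed velocity so a net negative V^2 stays visible.
    return math.copysign(math.sqrt(abs(v2)), v2)


# Made-up ring values in km/s, not taken from the corpus.
rings = [
    {"r_kpc": 1.0, "v_obs": 60.0, "v_gas": 10.0, "v_disk": 40.0},
    {"r_kpc": 4.0, "v_obs": 90.0, "v_gas": 30.0, "v_disk": 60.0},
]

for ring in rings:
    ring["v_bar"] = v_baryonic(ring["v_gas"], ring["v_disk"])
```

Comparing `v_bar` with `v_obs` ring by ring gives the mass discrepancy that dark-matter or modified-gravity models then have to explain.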
Where Pith is reading between the lines
- The corpus could serve as a ready-made benchmark set for testing new dark-matter or modified-gravity models across a wide range of galaxy masses and environments.
- Its design may encourage similar unification efforts for other wavelength regimes or other galaxy properties such as stellar kinematics or gas surface-density profiles.
- Cross-survey statistics on rotation-curve shapes or dark-matter halo parameters become feasible at corpus scale rather than galaxy-by-galaxy.
Load-bearing premise
That the kinematic parameters extracted from the four independent surveys can be accurately verified against their primary tables and combined into a consistent unified format without introducing significant systematic biases or loss of information.
What would settle it
A direct comparison in which velocity or radius values in the released corpus differ from the scanned primary tables by amounts larger than the quoted uncertainties for a substantial fraction of galaxies.
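The test described above can be sketched as a per-galaxy check that counts rings whose released velocity differs from the primary-table velocity by more than the quoted uncertainty. The tuple layout `(r_kpc, v_kms, e_v_kms)`, the function name, and the sample numbers are hypothetical; a real check would first need to match rings between the two sources.

```python
def fraction_discrepant(released, primary, n_sigma=1.0):
    """Fraction of rings where |v_released - v_primary| > n_sigma * e_v.

    Both inputs are lists of (r_kpc, v_kms, e_v_kms) tuples in the same
    ring order. Rings without a positive quoted uncertainty are skipped.
    """
    if len(released) != len(primary):
        raise ValueError("ring lists must have the same length")
    checked = 0
    discrepant = 0
    for (_, v_rel, e_rel), (_, v_pri, _) in zip(released, primary):
        if e_rel is None or e_rel <= 0:
            continue
        checked += 1
        if abs(v_rel - v_pri) > n_sigma * e_rel:
            discrepant += 1
    return discrepant / checked if checked else float("nan")


# Made-up values for one galaxy; the last ring has no uncertainty.
released = [(1.0, 50.0, 3.0), (2.0, 80.0, 4.0), (3.0, 95.0, 5.0), (4.0, 100.0, None)]
primary = [(1.0, 51.0, 3.0), (2.0, 90.0, 4.0), (3.0, 95.0, 5.0), (4.0, 100.0, None)]

frac = fraction_discrepant(released, primary)
```

Rings without uncertainties are excluded from the denominator here, which matters for Tier 2 products if they lack per-point errors; that handling is a design choice of this sketch, not something the paper specifies.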
Original abstract
We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. All radii are in kiloparsecs, all velocities in km/s. Kinematic parameters have been verified against scanned primary tables. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2). The corpus was designed for both traditional numerical analysis and Large Language Model retrieval-augmented generation (RAG) pipelines. Three worked examples demonstrate single-galaxy rotation curve plotting, multi-component baryonic analysis, and corpus-level parameter-space exploration, each requiring fewer than 15 lines of Python. The corpus is publicly available at Zenodo (DOI: 10.5281/zenodo.19563417) under CC BY 4.0.
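The abstract states that all radii are in kiloparsecs. Where a primary table lists ring radii in arcseconds, putting them on that scale requires the galaxy distance and the small-angle relation. The sketch below shows that standard conversion; the function name and the sample distance are illustrative, and this page does not say which distances or which expression the authors used.

```python
import math


def arcsec_to_kpc(theta_arcsec, distance_mpc):
    """Convert an angular radius to a physical radius in kpc.

    Small-angle approximation: R = D * theta, with theta in radians
    and D converted from Mpc to kpc.
    """
    theta_rad = math.radians(theta_arcsec / 3600.0)
    return distance_mpc * 1000.0 * theta_rad


# At an assumed distance of 10 Mpc, 1 arcsec is roughly 0.048 kpc.
scale_kpc_per_arcsec = arcsec_to_kpc(1.0, 10.0)

# Hypothetical ring radii in arcseconds for a galaxy at 10 Mpc.
radii_kpc = [arcsec_to_kpc(theta, 10.0) for theta in (15.0, 30.0, 45.0)]
```

Because the conversion is linear in distance, any revision of an adopted distance rescales every radius of that galaxy by the same factor.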
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a unified corpus of 8,963 spatially resolved HI rotation curve measurements from 423 galaxies (438 catalog entries total, including 15 metadata-only THINGS entries), compiled from SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203) surveys. The data are released as a structured JSON file with nested per-ring kinematics, metadata, and quality annotations, plus a flat CSV catalog; all values are in kpc and km/s. Kinematic parameters are stated to have been verified against scanned primary tables. A two-tier quality system is applied (Tier 1: hand-curated with uncertainties; Tier 2: automated pipeline products). The corpus is designed for numerical analysis and LLM RAG use, with three short Python examples provided, and is publicly available on Zenodo under CC BY 4.0.
Significance. If the unification and verification steps are robustly documented and free of systematic transcription or format-conversion errors, the corpus would provide a valuable standardized resource for computational astrophysics. Strengths include the public Zenodo release, dual JSON/CSV formats suitable for both traditional analysis and RAG pipelines, explicit quality tiering, and the inclusion of concise worked examples that lower the barrier to entry for users.
major comments (1)
- [Abstract] Abstract: The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.
minor comments (2)
- [Data release description] The distinction between 423 galaxies and 438 catalog entries (due to 15 metadata-only THINGS entries) is clearly stated in the abstract but should be reiterated in the data-release section to avoid potential user confusion when filtering the CSV catalog.
- [Examples section] The manuscript states that the three worked examples require fewer than 15 lines of Python; including the actual code snippets (or links to a companion notebook) in the text would improve immediate reproducibility without increasing length substantially.
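The 423 versus 438 distinction raised in the first minor comment can be checked with simple arithmetic and a catalog filter. The sketch below uses the survey counts quoted in the abstract and a three-row toy CSV; the column names `name`, `survey`, and `n_points` are hypothetical and may not match the released catalog.

```python
import csv
import io

# Survey counts as quoted in the abstract.
survey_counts = {"SPARC": 175, "THINGS": 34, "LITTLE THINGS": 26, "WALLABY DR2": 203}
total_entries = sum(survey_counts.values())   # all catalog entries
with_curves = total_entries - 15              # minus metadata-only THINGS entries

# Toy catalog with hypothetical column names.
toy_csv = """name,survey,n_points
DEMO-A,SPARC,12
DEMO-B,THINGS,0
DEMO-C,WALLABY DR2,7
"""

rows = list(csv.DictReader(io.StringIO(toy_csv)))

# Keep only entries that actually carry rotation curve points.
has_curve = [row for row in rows if int(row["n_points"]) > 0]
```

A user who counts rows in the flat CSV without such a filter would get 438, not the 423 galaxies that contribute measurements.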
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for highlighting the importance of transparent documentation for the verification process. We agree that additional details are needed to substantiate the claim in the abstract and will revise the manuscript to address this.
- Referee: [Abstract] Abstract: The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.
  Authors: We acknowledge that the current manuscript does not provide a detailed description of the verification protocol, which is a valid point. In the revised version, we will add a new subsection titled 'Verification Against Primary Tables' (placed after the survey compilation description) that explicitly outlines: the manual scanning and digitization workflow applied to the original published tables for all surveys; the quantitative metrics computed (including per-survey fractions of adjusted points and maximum absolute residuals); the discrepancy adjudication rules (prioritizing scanned primary values, with automated pipeline outputs used only as fallback for Tier 2 entries); and survey-specific statistics, with dedicated discussion of the WALLABY DR2 sample. These additions will be supported by a supplementary table summarizing agreement levels. This revision strengthens the manuscript without changing the released dataset.
  Revision: yes
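The per-survey metrics named in the response (fraction of adjusted points and maximum absolute residual) could be tabulated along the following lines. This is a sketch under stated assumptions: the record keys `survey`, `v_released`, and `v_primary` are hypothetical, the values are made up, and "adjusted" is taken here to mean that the released value differs from the primary value by more than a small tolerance, a definition the rebuttal does not spell out.

```python
from collections import defaultdict


def verification_summary(records, tol=1e-6):
    """Per-survey point count, fraction adjusted, and max |residual| in km/s."""
    stats = defaultdict(lambda: {"n": 0, "n_adjusted": 0, "max_abs_resid": 0.0})
    for rec in records:
        resid = abs(rec["v_released"] - rec["v_primary"])
        entry = stats[rec["survey"]]
        entry["n"] += 1
        if resid > tol:
            entry["n_adjusted"] += 1
        entry["max_abs_resid"] = max(entry["max_abs_resid"], resid)
    return {
        survey: {**entry, "frac_adjusted": entry["n_adjusted"] / entry["n"]}
        for survey, entry in stats.items()
    }


# Made-up matched points, not values from the corpus.
records = [
    {"survey": "SPARC", "v_released": 50.0, "v_primary": 50.0},
    {"survey": "SPARC", "v_released": 80.0, "v_primary": 80.5},
    {"survey": "WALLABY DR2", "v_released": 120.0, "v_primary": 118.0},
    {"survey": "WALLABY DR2", "v_released": 130.0, "v_primary": 130.0},
    {"survey": "WALLABY DR2", "v_released": 140.0, "v_primary": 141.0},
    {"survey": "WALLABY DR2", "v_released": 150.0, "v_primary": 150.0},
]

summary = verification_summary(records)
```

Reporting the metrics per survey rather than corpus-wide keeps a systematic offset in one survey from being diluted by the others, which is the referee's concern about the WALLABY DR2 sample.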
Circularity Check
No circularity: pure data aggregation with external verification
The paper compiles and formats existing HI rotation curve data from four independent surveys (SPARC, THINGS, LITTLE THINGS, WALLABY DR2) into a unified JSON/CSV corpus. No derivations, equations, model fits, predictions, or first-principles claims are present. The central claim rests on verification against scanned primary tables from those external surveys, which is an independent check rather than a self-referential reduction. No self-citations, ansatzes, or renamings of results occur in any load-bearing step. This is a standard data-release manuscript with no internal circular reasoning.
Forward citations
Cited by 1 Pith paper
- A Unified H i Rotation Curve Database for 129 Local Volume Dwarf and Irregular Galaxies: a unified HI rotation curve database for 129 Local Volume dwarf and irregular galaxies with standardized parameters, quality tiers, and machine-readable formats.
Reference graph
Works this paper leans on
- [1] J.F. Navarro, C.S. Frenk, S.D.M. White, The structure of cold dark matter halos, ApJ 462 (1996) 563
- [2] M. Milgrom, A modification of the Newtonian dynamics as a possible alternative to the hidden mass hypothesis, ApJ 270 (1983) 365
- [3] S.S. McGaugh, The mass discrepancy–acceleration relation, ApJ 609 (2004) 652
- [4] D.C. Flynn, J. Cannaliato, A new empirical fit to galaxy rotation curves, Frontiers in Astronomy and Space Sciences 12 (2025). doi:10.3389/fspas.2025.1680387
- [5] F. Lelli, S.S. McGaugh, J.M. Schombert, SPARC: Mass models for 175 disk galaxies with Spitzer photometry and accurate rotation curves, AJ 152 (2016) 157
- [6]
- [7] W.J.G. de Blok, F. Walter, E. Brinks, et al., High-resolution rotation curves and galaxy mass models from THINGS, AJ 136 (2008) 2648
- [8] S.-H. Oh, D.A. Hunter, E. Brinks, et al., High-resolution mass models of dwarf galaxies from LITTLE THINGS, AJ 149 (2015) 180
- [9] T. Westmeier, N. Deg, K. Spekkens, et al., WALLABY: an SKA Pathfinder HI survey, PASA 39 (2022) e058
- [10] N. Deg, K. Spekkens, T. Westmeier, et al., WALLABY kinematic modelling, PASA 39 (2022) e059
- [11] C. Murugeshan, N. Deg, T. Westmeier, et al., WALLABY pilot survey: public data release of ∼1800 HI sources and high-resolution cut-outs from Pilot Survey Phase 2, PASA 41 (2024) e088
- [12] E.M. Di Teodoro, F. Fraternali, 3D Barolo: a new 3D algorithm to derive rotation curves of galaxies, MNRAS 451 (2015) 3021
- [13] P. Kamphuis, G.I.G. Józsa, S.-H. Oh, et al., Automated kinematic modelling of warped galaxies, MNRAS 452 (2015) 3139