A Unified HI Rotation Curve Corpus for Computational Astrophysics: 438 Galaxies from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2

David C. Flynn

arxiv: 2604.13489 · v1 · submitted 2026-04-15 · 🌌 astro-ph.GA · astro-ph.IM

Pith reviewed 2026-05-10 13:12 UTC · model grok-4.3

keywords HI rotation curves · galaxy kinematics · SPARC · THINGS · LITTLE THINGS · WALLABY · data corpus · computational astrophysics

The pith

A single verified corpus unifies 8,963 HI rotation curve measurements from 423 galaxies (438 catalog entries, 15 of them metadata-only) drawn from four major surveys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles spatially resolved HI rotation curves from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2 into one consistent dataset. All radii are expressed in kiloparsecs and velocities in km/s, with kinematic parameters checked against the original published tables. The release supplies both a nested JSON file containing per-ring data and quality annotations and a flat CSV catalog for quick selection. The structure supports ordinary numerical work as well as retrieval-augmented generation pipelines for large language models. Three short Python examples show how to plot individual curves, perform baryonic decompositions, and explore the full parameter space.
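As a rough illustration of how the nested per-ring layout described above might be consumed, the sketch below parses a tiny in-memory JSON sample. Every key name (`galaxies`, `survey`, `tier`, `rings`, `r_kpc`, `v_rot_kms`, `e_v_kms`) and both galaxy names are hypothetical; the released file's actual schema should be taken from its own column definitions.

```python
import json

# Hypothetical schema: all key names and galaxy names below are assumptions
# made for illustration, not the layout of the released corpus.
SAMPLE = """
{
  "galaxies": {
    "DEMO-1": {
      "survey": "SPARC",
      "tier": 1,
      "rings": [
        {"r_kpc": 1.0, "v_rot_kms": 50.0, "e_v_kms": 3.0},
        {"r_kpc": 2.0, "v_rot_kms": 80.0, "e_v_kms": 2.5}
      ]
    },
    "DEMO-2": {
      "survey": "WALLABY DR2",
      "tier": 2,
      "rings": [
        {"r_kpc": 3.0, "v_rot_kms": 120.0, "e_v_kms": null}
      ]
    }
  }
}
"""


def load_corpus(text):
    """Parse the JSON text and return the per-galaxy dictionary."""
    return json.loads(text)["galaxies"]


def curve(galaxies, name):
    """Return (radii in kpc, velocities in km/s) for one galaxy."""
    rings = galaxies[name]["rings"]
    radii = [ring["r_kpc"] for ring in rings]
    velocities = [ring["v_rot_kms"] for ring in rings]
    return radii, velocities


def select_by_tier(galaxies, tier):
    """Return the sorted names of galaxies carrying the given quality tier."""
    return sorted(name for name, entry in galaxies.items() if entry["tier"] == tier)


galaxies = load_corpus(SAMPLE)
radii, velocities = curve(galaxies, "DEMO-1")
tier1 = select_by_tier(galaxies, 1)
```

The two returned lists can be passed to any plotting library. Filtering on the tier value is one way a user might restrict an analysis to hand-curated curves.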

Core claim

What carries the argument

The unified JSON file containing nested per-ring kinematic data, survey metadata, and a two-tier quality flag system that separates hand-curated curves from automated products.

If this is right

Multi-survey analyses of galaxy dynamics become possible without repeated format conversion or cross-checking steps.
The three worked Python examples demonstrate that standard tasks such as rotation-curve plotting and baryonic mass modeling require fewer than fifteen lines of code.
The RAG-friendly structure allows large-language-model pipelines to retrieve specific rotation-curve segments or quality-filtered subsets directly.
The public CC-BY license and Zenodo DOI remove legal and access barriers for reuse in computational astrophysics studies.
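The baryonic mass modeling mentioned in the list above usually reduces to a quadrature sum of component velocities. The sketch below assumes the convention used in SPARC-style mass models, where disk and bulge contributions are tabulated for a unit stellar mass-to-light ratio and rescaled by a factor Υ. The function name, the default Υ values, and the input numbers are illustrative assumptions, not values taken from the corpus.

```python
import math


def v_baryonic(v_gas, v_disk, v_bulge=0.0, ups_disk=0.5, ups_bulge=0.7):
    """Total baryonic circular velocity in km/s at one radius.

    Assumes all component velocities are non-negative and that v_disk and
    v_bulge were computed for a mass-to-light ratio of 1, so scaling by
    ups_* rescales the enclosed stellar mass. The default ups_* values are
    placeholders chosen for this sketch.
    """
    v_squared = v_gas**2 + ups_disk * v_disk**2 + ups_bulge * v_bulge**2
    return math.sqrt(v_squared)


# Illustrative inputs in km/s, not corpus values.
v_bar = v_baryonic(v_gas=30.0, v_disk=80.0)
```

Comparing `v_bar` with the observed rotation velocity at the same radius gives the mass discrepancy that dark-matter or modified-gravity models are meant to explain.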

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The corpus could serve as a ready-made benchmark set for testing new dark-matter or modified-gravity models across a wide range of galaxy masses and environments.
Its design may encourage similar unification efforts for other wavelength regimes or other galaxy properties such as stellar kinematics or gas surface-density profiles.
Cross-survey statistics on rotation-curve shapes or dark-matter halo parameters become feasible at corpus scale rather than galaxy-by-galaxy.
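To make the benchmarking idea in the list above concrete, the following sketch scores a dark-matter halo model against a rotation curve with a chi-square statistic. It uses the standard NFW circular-velocity profile as one example model. The parameter values and the synthetic data points are assumptions for illustration, and the baryonic contribution is ignored for brevity.

```python
import math


def nfw_velocity(r_kpc, v200_kms, c, r200_kpc):
    """Circular velocity of an NFW halo at radius r_kpc, in km/s."""
    x = r_kpc / r200_kpc

    def mass_shape(y):
        # Dimensionless enclosed-mass function of the NFW profile.
        return math.log(1.0 + y) - y / (1.0 + y)

    return v200_kms * math.sqrt(mass_shape(c * x) / (x * mass_shape(c)))


def chi_square(radii, velocities, errors, model):
    """Sum of squared, error-normalised residuals between data and model."""
    return sum(((v - model(r)) / e) ** 2 for r, v, e in zip(radii, velocities, errors))


def halo(r_kpc):
    # Illustrative halo parameters, not fitted to any corpus galaxy.
    return nfw_velocity(r_kpc, v200_kms=150.0, c=10.0, r200_kpc=200.0)


# Synthetic "observations": the model shifted upward by one error bar.
radii = [5.0, 10.0, 20.0]
errors = [5.0, 5.0, 5.0]
velocities = [halo(r) + 5.0 for r in radii]
chi2 = chi_square(radii, velocities, errors, halo)
```

Each of the three points sits one error bar above the model, so `chi2` comes out close to 3. A realistic benchmark would add the baryonic terms and fit the halo parameters per galaxy.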

Load-bearing premise

That the kinematic parameters extracted from the four independent surveys can be accurately verified against their primary tables and combined into a consistent unified format without introducing significant systematic biases or loss of information.

What would settle it

A direct comparison in which velocity or radius values in the released corpus differ from the scanned primary tables by amounts larger than the quoted uncertainties for a substantial fraction of galaxies.
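The comparison described above can be sketched as a galaxy-level check: flag a galaxy when any of its points differs from the primary-table value by more than the quoted uncertainty, then report the flagged fraction. The data layout, the galaxy names, and the one-sigma threshold are assumptions made for this example.

```python
def is_discrepant(points):
    """True if any (v_corpus, v_primary, sigma) point differs by more than sigma."""
    return any(abs(v_corpus - v_primary) > sigma for v_corpus, v_primary, sigma in points)


def discrepant_fraction(sample):
    """Return (fraction of flagged galaxies, sorted list of flagged names)."""
    flagged = sorted(name for name, points in sample.items() if is_discrepant(points))
    return len(flagged) / len(sample), flagged


# Hypothetical values in km/s: (corpus velocity, primary-table velocity, uncertainty).
sample = {
    "DEMO-A": [(50.0, 50.0, 3.0), (80.0, 80.5, 2.0)],
    "DEMO-B": [(120.0, 112.0, 4.0)],
}
fraction, flagged = discrepant_fraction(sample)
```

With these made-up numbers only `DEMO-B` is flagged. What counts as a "substantial fraction" remains a judgment the check itself does not settle.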

Figures

Figures reproduced from arXiv: 2604.13489 by David C. Flynn.

**Figure 2.** WALLABY J165901−601241 (Tier 2) loaded from corpus JSON. The 50 km/s beam-smearing caution zone is shaded. Metadata (D = 15.2 Mpc, inc = 50.7°, 37 rings) is extracted directly from the JSON. [PITH_FULL_IMAGE:figures/full_fig_p012_2.png]

**Figure 3.** Corpus population overview. (a) Peak rotation velocity distribution across all four [PITH_FULL_IMAGE:figures/full_fig_p013_3.png]

Original abstract

We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. All radii are in kiloparsecs, all velocities in km/s. Kinematic parameters have been verified against scanned primary tables. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2). The corpus was designed for both traditional numerical analysis and Large Language Model retrieval-augmented generation (RAG) pipelines. Three worked examples demonstrate single-galaxy rotation curve plotting, multi-component baryonic analysis, and corpus-level parameter-space exploration, each requiring fewer than 15 lines of Python. The corpus is publicly available at Zenodo (DOI: 10.5281/zenodo.19563417) under CC BY 4.0.
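The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The figures below are copied from the abstract; the variable names are chosen for this sketch, and the mean number of points per galaxy is a derived quantity that the abstract does not state.

```python
# Per-survey catalog entries as quoted in the abstract.
survey_counts = {"SPARC": 175, "THINGS": 34, "LITTLE THINGS": 26, "WALLABY DR2": 203}
metadata_only = 15          # THINGS entries without per-ring data
total_measurements = 8963   # rotation curve points in the corpus

catalog_entries = sum(survey_counts.values())
galaxies_with_curves = catalog_entries - metadata_only
things_with_curves = survey_counts["THINGS"] - metadata_only
mean_points = total_measurements / galaxies_with_curves

print(catalog_entries, galaxies_with_curves, things_with_curves, round(mean_points, 1))
```

The sums reproduce the 438 catalog entries and 423 galaxies with rotation curves, and imply that 19 of the 34 THINGS entries carry per-ring data.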

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper repackages existing HI rotation curve data from four surveys into one JSON corpus with quality tiers and examples, which is handy for standardization but thin on how the unification was actually checked.

The main thing here is a data aggregation effort that puts 8,963 rotation curve points from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2 into a single structured JSON file, plus a flat CSV and a few short Python scripts. All radii are in kpc and velocities in km/s, with nested per-ring data, survey metadata, and a two-tier quality flag that separates hand-curated curves from automated ones. The files are on Zenodo under CC BY 4.0, and the examples cover plotting a single curve, baryonic decomposition, and quick corpus-wide exploration, each in under 15 lines of code. The abstract notes that parameters were verified against scanned primary tables, and the work targets both traditional analysis and RAG-style LLM use. That is the actual new product: one consistent, downloadable package instead of hunting down separate survey releases in different formats.

It does well at the convenience angle. Anyone running multi-galaxy models or needing uniform input for computational work saves time on format wrangling and unit conversion. Adding WALLABY DR2 brings in more recent data, and the tier system gives users a quick way to filter for higher-confidence points.

The soft spot is the verification step. The abstract asserts the check happened, but there is no description of the protocol, how many points differed from the originals, what was done about conflicts, or any quantitative residuals. That gap matters most for the 203 WALLABY entries labeled Tier 2, where automated ring selection or velocity extraction could introduce systematic offsets that the two-tier label alone does not catch. Without those details or sample comparison tables, users cannot fully assess fidelity.

This paper is for computational astrophysicists or data-focused researchers who want a ready-made, multi-survey rotation curve set for modeling or machine-learning pipelines. It is not advancing new physics or fitting methods, just access and consistency.

It deserves peer review so that referees can examine the compilation choices and ask for the missing verification documentation. I would send it out rather than desk reject.

Referee Report

1 major / 2 minor

Summary. The paper presents a unified corpus of 8,963 spatially resolved HI rotation curve measurements from 423 galaxies (438 catalog entries total, including 15 metadata-only THINGS entries), compiled from SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203) surveys. The data are released as a structured JSON file with nested per-ring kinematics, metadata, and quality annotations, plus a flat CSV catalog; all values are in kpc and km/s. Kinematic parameters are stated to have been verified against scanned primary tables. A two-tier quality system is applied (Tier 1: hand-curated with uncertainties; Tier 2: automated pipeline products). The corpus is designed for numerical analysis and LLM RAG use, with three short Python examples provided, and is publicly available on Zenodo under CC BY 4.0.

Significance. If the unification and verification steps are robustly documented and free of systematic transcription or format-conversion errors, the corpus would provide a valuable standardized resource for computational astrophysics. Strengths include the public Zenodo release, dual JSON/CSV formats suitable for both traditional analysis and RAG pipelines, explicit quality tiering, and the inclusion of concise worked examples that lower the barrier to entry for users.

major comments (1)

[Abstract] Abstract: The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.

minor comments (2)

[Data release description] The distinction between 423 galaxies and 438 catalog entries (due to 15 metadata-only THINGS entries) is clearly stated in the abstract but should be reiterated in the data-release section to avoid potential user confusion when filtering the CSV catalog.
[Examples section] The manuscript states that the three worked examples require fewer than 15 lines of Python; including the actual code snippets (or links to a companion notebook) in the text would improve immediate reproducibility without increasing length substantially.
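The 423 versus 438 distinction raised in the first minor comment could be handled at load time by splitting the catalog on whether an entry has per-ring data. The sketch below assumes a hypothetical `n_rings` column and made-up rows; the real CSV may expose this information under a different name or not at all.

```python
import csv
import io

# Hypothetical catalog excerpt: column names and rows are assumptions.
SAMPLE_CSV = """name,survey,n_rings
DEMO-1,SPARC,12
DEMO-2,THINGS,0
DEMO-3,WALLABY DR2,37
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Entries with at least one ring have a usable rotation curve;
# entries with zero rings are treated as metadata-only.
with_curves = [row["name"] for row in rows if int(row["n_rings"]) > 0]
metadata_only = [row["name"] for row in rows if int(row["n_rings"]) == 0]
```

Applied to the full catalog, a split of this kind would be expected to return 423 and 15 entries respectively, if the column behaves as assumed here.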

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for highlighting the importance of transparent documentation for the verification process. We agree that additional details are needed to substantiate the claim in the abstract and will revise the manuscript to address this.

Referee: [Abstract] Abstract: The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.

Authors: We acknowledge that the current manuscript does not provide a detailed description of the verification protocol, which is a valid point. In the revised version, we will add a new subsection titled 'Verification Against Primary Tables' (placed after the survey compilation description) that explicitly outlines: the manual scanning and digitization workflow applied to the original published tables for all surveys; the quantitative metrics computed (including per-survey fractions of adjusted points and maximum absolute residuals); the discrepancy adjudication rules (prioritizing scanned primary values, with automated pipeline outputs used only as fallback for Tier 2 entries); and survey-specific statistics, with dedicated discussion of the WALLABY DR2 sample. These additions will be supported by a supplementary table summarizing agreement levels. This revision strengthens the manuscript without changing the released dataset. revision: yes
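The per-survey metrics named in the response (fraction of adjusted points and maximum absolute residual) could be tabulated along the following lines. This is a sketch under assumed inputs, not the authors' procedure: the record layout, the tolerance, and the sample values are all hypothetical.

```python
from collections import defaultdict


def agreement_by_survey(records, tol=0.0):
    """Summarise corpus-versus-primary agreement per survey.

    records: iterable of (survey, v_corpus_kms, v_primary_kms).
    A point counts as adjusted when |v_corpus - v_primary| exceeds tol.
    """
    stats = defaultdict(lambda: {"n": 0, "n_adjusted": 0, "max_abs_resid": 0.0})
    for survey, v_corpus, v_primary in records:
        resid = abs(v_corpus - v_primary)
        entry = stats[survey]
        entry["n"] += 1
        if resid > tol:
            entry["n_adjusted"] += 1
        entry["max_abs_resid"] = max(entry["max_abs_resid"], resid)
    return {
        survey: {
            "fraction_adjusted": entry["n_adjusted"] / entry["n"],
            "max_abs_resid": entry["max_abs_resid"],
        }
        for survey, entry in stats.items()
    }


# Hypothetical comparison records in km/s.
records = [
    ("SPARC", 50.0, 50.0),
    ("SPARC", 80.0, 80.5),
    ("WALLABY DR2", 120.0, 118.0),
]
summary = agreement_by_survey(records, tol=0.1)
```

A table built this way, one row per survey, is the kind of supplementary summary the major comment asks for.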

Circularity Check

0 steps flagged

No circularity: pure data aggregation with external verification

The paper compiles and formats existing HI rotation curve data from four independent surveys (SPARC, THINGS, LITTLE THINGS, WALLABY DR2) into a unified JSON/CSV corpus. No derivations, equations, model fits, predictions, or first-principles claims are present. The central claim rests on verification against scanned primary tables from those external surveys, which is an independent check rather than a self-referential reduction. No self-citations, ansatzes, or renamings of results occur in any load-bearing step. This is a standard data-release manuscript with no internal circular reasoning.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a data curation paper that aggregates existing observational measurements; it introduces no new physical models, free parameters, axioms, or invented entities beyond the standard assumptions of HI rotation curve observations from the cited surveys.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Unified H i Rotation Curve Database for 129 Local Volume Dwarf and Irregular Galaxies
astro-ph.GA 2026-05 accept novelty 5.0

A unified HI rotation curve database for 129 Local Volume dwarf and irregular galaxies with standardized parameters, quality tiers, and machine-readable formats.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · cited by 1 Pith paper

[1] J.F. Navarro, C.S. Frenk, S.D.M. White, The structure of cold dark matter halos, ApJ 462 (1996) 563
[2] M. Milgrom, A modification of the Newtonian dynamics as a possible alternative to the hidden mass hypothesis, ApJ 270 (1983) 365
[3] S.S. McGaugh, The mass discrepancy–acceleration relation, ApJ 609 (2004) 652
[4] D.C. Flynn, J. Cannaliato, A new empirical fit to galaxy rotation curves, Frontiers in Astronomy and Space Sciences 12 (2025). doi:10.3389/fspas.2025.1680387
[5] F. Lelli, S.S. McGaugh, J.M. Schombert, SPARC: Mass models for 175 disk galaxies with Spitzer photometry and accurate rotation curves, AJ 152 (2016) 157
[6] F. Walter, E. Brinks, W.J.G. de Blok, et al., THINGS: The HI Nearby Galaxy Survey, AJ 136 (2008) 2563
[7] W.J.G. de Blok, F. Walter, E. Brinks, et al., High-resolution rotation curves and galaxy mass models from THINGS, AJ 136 (2008) 2648
[8] S.-H. Oh, D.A. Hunter, E. Brinks, et al., High-resolution mass models of dwarf galaxies from LITTLE THINGS, AJ 149 (2015) 180
[9] T. Westmeier, N. Deg, K. Spekkens, et al., WALLABY: an SKA Pathfinder HI survey, PASA 39 (2022) e058
[10] N. Deg, K. Spekkens, T. Westmeier, et al., WALLABY kinematic modelling, PASA 39 (2022) e059
[11] C. Murugeshan, N. Deg, T. Westmeier, et al., WALLABY pilot survey: public data release of ∼1800 HI sources and high-resolution cut-outs from Pilot Survey Phase 2, PASA 41 (2024) e088
[12] E.M. Di Teodoro, F. Fraternali, 3D Barolo: a new 3D algorithm to derive rotation curves of galaxies, MNRAS 451 (2015) 3021
[13] P. Kamphuis, G.I.G. Józsa, S.-H. Oh, et al., Automated kinematic modelling of warped galaxies, MNRAS 452 (2015) 3139