
arxiv: 2604.13489 · v1 · submitted 2026-04-15 · 🌌 astro-ph.GA · astro-ph.IM

A Unified HI Rotation Curve Corpus for Computational Astrophysics: 438 Galaxies from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2

Pith reviewed 2026-05-10 13:12 UTC · model grok-4.3

classification 🌌 astro-ph.GA astro-ph.IM
keywords: HI rotation curves · galaxy kinematics · SPARC · THINGS · LITTLE THINGS · WALLABY · data corpus · computational astrophysics

The pith

A single verified corpus unifies 8,963 HI rotation curve measurements from 423 galaxies (438 catalog entries) drawn from four major surveys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles spatially resolved HI rotation curves from SPARC, THINGS, LITTLE THINGS, and WALLABY DR2 into one consistent dataset. All radii are expressed in kiloparsecs and velocities in km/s, with kinematic parameters checked against the original published tables. The release supplies both a nested JSON file containing per-ring data and quality annotations and a flat CSV catalog for quick selection. The structure supports ordinary numerical work as well as retrieval-augmented generation pipelines for large language models. Three short Python examples show how to plot individual curves, perform baryonic decompositions, and explore the full parameter space.

Core claim

We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2).

What carries the argument

The unified JSON file containing nested per-ring kinematic data, survey metadata, and a two-tier quality flag system that separates hand-curated curves from automated products.
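
A minimal sketch of how a nested file of this kind might be traversed and filtered by quality tier. The schema is an assumption: the key names `galaxies`, `survey`, `tier`, `rings`, `r_kpc`, `v_kms`, and `e_v_kms` are hypothetical and are not taken from the released file, and the sample values are made up for illustration.

```python
import json

# Hypothetical two-galaxy excerpt; key names and values are illustrative only
# and do not reproduce the released corpus schema.
SAMPLE = """
{
  "galaxies": [
    {"name": "Galaxy A", "survey": "SPARC", "tier": 1,
     "rings": [{"r_kpc": 1.0, "v_kms": 30.0, "e_v_kms": 2.0},
               {"r_kpc": 2.0, "v_kms": 45.0, "e_v_kms": 2.5}]},
    {"name": "Galaxy B", "survey": "WALLABY DR2", "tier": 2,
     "rings": [{"r_kpc": 3.0, "v_kms": 80.0, "e_v_kms": null}]}
  ]
}
"""


def select_tier(corpus, tier):
    """Return the galaxy records whose quality tier equals `tier`."""
    return [g for g in corpus["galaxies"] if g["tier"] == tier]


def curve(galaxy):
    """Unpack a galaxy's rings into parallel radius and velocity lists."""
    radii = [ring["r_kpc"] for ring in galaxy["rings"]]
    speeds = [ring["v_kms"] for ring in galaxy["rings"]]
    return radii, speeds


corpus = json.loads(SAMPLE)
tier1 = select_tier(corpus, 1)
radii, speeds = curve(tier1[0])
print(tier1[0]["name"], radii, speeds)
```

Keeping the tier on each galaxy record, as assumed here, would let a user drop automated products with a single list comprehension before any fitting is done.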

If this is right

  • Multi-survey analyses of galaxy dynamics become possible without repeated format conversion or cross-checking steps.
  • The three worked Python examples demonstrate that standard tasks such as rotation-curve plotting and baryonic mass modeling require fewer than fifteen lines of code.
  • The RAG-friendly structure allows large-language-model pipelines to retrieve specific rotation-curve segments or quality-filtered subsets directly.
  • The public CC-BY license and Zenodo DOI remove legal and access barriers for reuse in computational astrophysics studies.
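
To make the baryonic mass modeling task concrete, the sketch below combines gas, disk, and bulge velocity contributions using the sign-preserving convention of SPARC-style mass models, V_bar² = V_gas|V_gas| + Υ_disk V_disk|V_disk| + Υ_bul V_bul|V_bul|. This is not the paper's own example code. The function name and the default mass-to-light ratios (0.5 and 0.7) are assumptions chosen for illustration, and the input velocities are made up.

```python
import math


def v_baryon(v_gas, v_disk, v_bulge=0.0, ml_disk=0.5, ml_bulge=0.7):
    """Total baryonic velocity (km/s) from component velocities (km/s).

    Each term uses V*|V| so that a negative component velocity (for example
    a central gas depression) subtracts from the sum instead of adding to it.
    The mass-to-light ratios are illustrative defaults, not corpus values.
    """
    v2 = (v_gas * abs(v_gas)
          + ml_disk * v_disk * abs(v_disk)
          + ml_bulge * v_bulge * abs(v_bulge))
    # Return a signed square root so a net-negative sum is not silently lost.
    return math.copysign(math.sqrt(abs(v2)), v2)


# Made-up single-ring values for illustration.
v_obs = 60.0
v_bar = v_baryon(v_gas=30.0, v_disk=40.0)
mass_discrepancy = v_obs ** 2 / v_bar ** 2
print(round(v_bar, 2), round(mass_discrepancy, 2))
```

The ratio of observed to baryonic velocity squared is the usual mass discrepancy, which is the quantity a dark-matter or modified-gravity model would then need to explain.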

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The corpus could serve as a ready-made benchmark set for testing new dark-matter or modified-gravity models across a wide range of galaxy masses and environments.
  • Its design may encourage similar unification efforts for other wavelength regimes or other galaxy properties such as stellar kinematics or gas surface-density profiles.
  • Cross-survey statistics on rotation-curve shapes or dark-matter halo parameters become feasible at corpus scale rather than galaxy-by-galaxy.
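
A corpus-scale statistic of the kind described above can be sketched in a few lines. The example groups galaxies by survey and reports the median peak rotation velocity per survey. The record layout and all velocities are hypothetical; a real analysis would read them from the released JSON.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-galaxy records: (survey, ring velocities in km/s).
records = [
    ("SPARC", [40.0, 65.0, 70.0]),
    ("SPARC", [90.0, 120.0, 118.0]),
    ("LITTLE THINGS", [15.0, 28.0, 33.0]),
    ("WALLABY DR2", [100.0, 150.0, 149.0]),
    ("THINGS", []),  # stands in for a metadata-only entry with no rings
]


def peak_velocity_by_survey(records):
    """Median of per-galaxy peak velocities, keyed by survey name."""
    peaks = defaultdict(list)
    for survey, speeds in records:
        if speeds:  # entries without ring data contribute nothing
            peaks[survey].append(max(speeds))
    return {survey: median(vals) for survey, vals in peaks.items()}


summary = peak_velocity_by_survey(records)
for survey, v_peak in sorted(summary.items()):
    print(survey, v_peak)
```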

Load-bearing premise

That the kinematic parameters extracted from the four independent surveys can be accurately verified against their primary tables and combined into a consistent unified format without introducing significant systematic biases or loss of information.
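
One place where such a bias could enter is the conversion of angular ring radii to kiloparsecs, which scales linearly with the adopted distance. The sketch below shows the small-angle conversion. The paper summary does not describe how the conversion was actually performed, so the function name and the 60 arcsec input are assumptions; the 15.2 Mpc distance is the value quoted in the Figure 2 caption.

```python
import math

# Radians per arcsecond.
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def arcsec_to_kpc(theta_arcsec, distance_mpc):
    """Small-angle conversion of an angular radius to a physical radius in kpc."""
    return theta_arcsec * ARCSEC_TO_RAD * distance_mpc * 1000.0


# Hypothetical ring at 60 arcsec, at the distance quoted for Figure 2.
r_kpc = arcsec_to_kpc(60.0, 15.2)

# A 10% change in the adopted distance shifts every radius by the same 10%.
r_kpc_shifted = arcsec_to_kpc(60.0, 15.2 * 1.10)
print(round(r_kpc, 3), round(r_kpc_shifted / r_kpc, 3))
```

Because the shift is multiplicative, two surveys that adopt different distances for the same galaxy would yield rotation curves stretched relative to each other in radius, even if the measured velocities agree.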

What would settle it

A direct comparison in which velocity or radius values in the released corpus differ from the scanned primary tables by amounts larger than the quoted uncertainties for a substantial fraction of galaxies.
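
The comparison described above reduces to counting points whose residual exceeds the quoted uncertainty. A minimal sketch follows, assuming per-point uncertainties are available (as stated for Tier 1). The function name and all numbers are hypothetical.

```python
def fraction_discrepant(corpus_v, primary_v, sigma_v):
    """Fraction of points where |corpus - primary| exceeds the quoted uncertainty."""
    if not (len(corpus_v) == len(primary_v) == len(sigma_v)):
        raise ValueError("inputs must have equal length")
    if not corpus_v:
        raise ValueError("no points to compare")
    n_bad = sum(abs(c - p) > s
                for c, p, s in zip(corpus_v, primary_v, sigma_v))
    return n_bad / len(corpus_v)


# Made-up velocities (km/s) for one galaxy: corpus values, primary-table
# values, and quoted per-point uncertainties.
corpus_v = [30.0, 45.0, 52.0, 58.0]
primary_v = [30.0, 45.0, 57.0, 58.5]
sigma_v = [2.0, 2.5, 3.0, 3.0]

frac = fraction_discrepant(corpus_v, primary_v, sigma_v)
print(frac)
```

In this toy case only the third point differs by more than its uncertainty. What counts as "a substantial fraction of galaxies" is a threshold the reader would have to choose.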

Figures

Figures reproduced from arXiv: 2604.13489 by David C. Flynn.

Figure 1: DDO 161 (SPARC Tier 1) loaded from corpus JSON. Blue circles: …

Figure 2: WALLABY J165901−601241 (Tier 2) loaded from corpus JSON. The 50 km/s beam-smearing caution zone is shaded. Metadata (D = 15.2 Mpc, inc = 50.7°, 37 rings) is extracted directly from the JSON.

Figure 3: Corpus population overview. (a) Peak rotation velocity distribution across all four …
Original abstract

We present a unified corpus of 8,963 spatially resolved HI rotation curve measurements across 423 galaxies (438 total catalog entries including 15 metadata-only THINGS galaxies), drawn from four major surveys: SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203). The corpus is distributed as a single structured JSON file with nested per-ring kinematic data, survey metadata, column definitions, and data-quality annotations, accompanied by a 438-row flat CSV for catalog-level filtering. All radii are in kiloparsecs, all velocities in km/s. Kinematic parameters have been verified against scanned primary tables. A two-tier quality system distinguishes hand-curated rotation curves with per-point uncertainties (Tier 1) from automated pipeline products (Tier 2). The corpus was designed for both traditional numerical analysis and Large Language Model retrieval-augmented generation (RAG) pipelines. Three worked examples demonstrate single-galaxy rotation curve plotting, multi-component baryonic analysis, and corpus-level parameter-space exploration, each requiring fewer than 15 lines of Python. The corpus is publicly available at Zenodo (DOI: 10.5281/zenodo.19563417) under CC BY 4.0.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper presents a unified corpus of 8,963 spatially resolved HI rotation curve measurements from 423 galaxies (438 catalog entries total, including 15 metadata-only THINGS entries), compiled from SPARC (175), THINGS (34), LITTLE THINGS (26), and WALLABY DR2 (203) surveys. The data are released as a structured JSON file with nested per-ring kinematics, metadata, and quality annotations, plus a flat CSV catalog; all values are in kpc and km/s. Kinematic parameters are stated to have been verified against scanned primary tables. A two-tier quality system is applied (Tier 1: hand-curated with uncertainties; Tier 2: automated pipeline products). The corpus is designed for numerical analysis and LLM RAG use, with three short Python examples provided, and is publicly available on Zenodo under CC BY 4.0.

Significance. If the unification and verification steps are robustly documented and free of systematic transcription or format-conversion errors, the corpus would provide a valuable standardized resource for computational astrophysics. Strengths include the public Zenodo release, dual JSON/CSV formats suitable for both traditional analysis and RAG pipelines, explicit quality tiering, and the inclusion of concise worked examples that lower the barrier to entry for users.

major comments (1)
  1. [Abstract] The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.
minor comments (2)
  1. [Data release description] The distinction between 423 galaxies and 438 catalog entries (due to 15 metadata-only THINGS entries) is clearly stated in the abstract but should be reiterated in the data-release section to avoid potential user confusion when filtering the CSV catalog.
  2. [Examples section] The manuscript states that the three worked examples require fewer than 15 lines of Python; including the actual code snippets (or links to a companion notebook) in the text would improve immediate reproducibility without increasing length substantially.
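
The filtering concern in minor comment 1 can be illustrated with a short sketch that separates entries with rotation-curve data from metadata-only entries. The column names (`name`, `survey`, `tier`, `n_rings`) and the idea that metadata-only entries carry a ring count of zero are assumptions; the released CSV may encode this differently.

```python
import csv
import io

# Hypothetical catalog excerpt; column names and values are illustrative only.
CATALOG = """name,survey,tier,n_rings
Galaxy A,SPARC,1,12
Galaxy B,THINGS,1,0
Galaxy C,WALLABY DR2,2,37
"""

rows = list(csv.DictReader(io.StringIO(CATALOG)))

# Assumed convention: a ring count of zero marks a metadata-only entry.
with_curves = [r for r in rows if int(r["n_rings"]) > 0]
metadata_only = [r for r in rows if int(r["n_rings"]) == 0]

print(len(rows), len(with_curves), [r["name"] for r in metadata_only])
```

Applied to the full catalog under this assumed convention, the same two filters would be expected to split the 438 rows into the 423 galaxies with rotation curves and the 15 metadata-only THINGS entries.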

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for highlighting the importance of transparent documentation for the verification process. We agree that additional details are needed to substantiate the claim in the abstract and will revise the manuscript to address this.

Point-by-point responses
  1. Referee: [Abstract] The central claim that 'Kinematic parameters have been verified against scanned primary tables' is load-bearing for the assertion of a faithful unified corpus, yet the manuscript provides no description of the verification protocol, discrepancy-resolution procedures, quantitative agreement metrics (e.g., fraction of points altered or maximum residuals), or how conflicts between scanned values and pipeline outputs were adjudicated. This omission is particularly material for the 203 WALLABY DR2 galaxies (Tier 2, automated), where any systematic offset in ring selection or velocity extraction would propagate into the 8,963-row dataset without being flagged beyond the tier label.

    Authors: We acknowledge that the current manuscript does not provide a detailed description of the verification protocol, which is a valid point. In the revised version, we will add a new subsection titled 'Verification Against Primary Tables' (placed after the survey compilation description) that explicitly outlines: the manual scanning and digitization workflow applied to the original published tables for all surveys; the quantitative metrics computed (including per-survey fractions of adjusted points and maximum absolute residuals); the discrepancy adjudication rules (prioritizing scanned primary values, with automated pipeline outputs used only as fallback for Tier 2 entries); and survey-specific statistics, with dedicated discussion of the WALLABY DR2 sample. These additions will be supported by a supplementary table summarizing agreement levels. This revision strengthens the manuscript without changing the released dataset. revision: yes

Circularity Check

0 steps flagged

No circularity: pure data aggregation with external verification

Full rationale

The paper compiles and formats existing HI rotation curve data from four independent surveys (SPARC, THINGS, LITTLE THINGS, WALLABY DR2) into a unified JSON/CSV corpus. No derivations, equations, model fits, predictions, or first-principles claims are present. The central claim rests on verification against scanned primary tables from those external surveys, which is an independent check rather than a self-referential reduction. No self-citations, ansatzes, or renamings of results occur in any load-bearing step. This is a standard data-release manuscript with no internal circular reasoning.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a data curation paper that aggregates existing observational measurements; it introduces no new physical models, free parameters, axioms, or invented entities beyond the standard assumptions of HI rotation curve observations from the cited surveys.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Unified H i Rotation Curve Database for 129 Local Volume Dwarf and Irregular Galaxies

    astro-ph.GA 2026-05 accept novelty 5.0

    A unified HI rotation curve database for 129 Local Volume dwarf and irregular galaxies with standardized parameters, quality tiers, and machine-readable formats.
