
A Multi-Survey Machine-Readable Corpus of Milky Way Globular Cluster Parameters for Retrieval-Augmented Generation Applications

abstract

We present the Milky Way Globular Cluster Corpus v1.3.1, a unified machine-readable database of fundamental parameters for 174 Milky Way globular clusters, assembled from four independent published surveys. Each cluster record integrates photometric, structural, and spectroscopically calibrated metallicity parameters from Harris (1996, 2010 revision), Gaia EDR3 proper motions from Vasiliev & Baumgardt (2021), N-body dynamical masses and orbital parameters from Baumgardt et al. (2023), and mean chemical abundances from the APOGEE DR17 globular cluster Value Added Catalog of Schiavon et al. (2024). The corpus contains 17,438 non-null data points across the 174 clusters, stored in JSONL, JSON, and flat CSV formats with consistent native-typed fields (float, int, bool, null), embedded provenance blocks, and a fully documented schema. Survey coverage is 157/174 clusters for Harris photometry, 170/174 for Gaia EDR3 proper motions, 154/174 for Baumgardt N-body dynamics, and 72/174 for APOGEE DR17 chemistry. The corpus was designed as a Retrieval-Augmented Generation (RAG) knowledge base for large language model applications in astrophysics research. It follows the same multi-survey integration methodology as the Unified Galaxy HI Rotation Curve Corpus (Flynn 2026) and has been validated for structured context injection with instruction-following language models. It is equally suitable for traditional quantitative analyses, including orbit modeling, cluster classification, chemical tagging, and multi-survey cross-validation. The dataset is available on Zenodo, DOI: 10.5281/zenodo.19907766.
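As a rough illustration of how the JSONL form of such a corpus might be consumed, the sketch below loads one record per line and counts how many records carry a non-null block for each survey. The record layout and the key names (`harris`, `gaia_edr3`, `baumgardt`, `apogee_dr17`) are assumptions made for this example and are not taken from the published schema; the two sample records contain placeholder values, not real measurements. The coverage counts in `REPORTED` are the ones quoted in the abstract.

```python
import io
import json

# Synthetic placeholder records. The key names and nesting are hypothetical,
# and the numbers are not real cluster measurements.
SAMPLE_JSONL = """\
{"name": "Placeholder A", "harris": {"feh": -1.5}, "gaia_edr3": {"pmra": 1.2}, "baumgardt": {"mass": 2.0e5}, "apogee_dr17": null}
{"name": "Placeholder B", "harris": null, "gaia_edr3": {"pmra": -0.4}, "baumgardt": null, "apogee_dr17": {"fe_h": -0.7}}
"""

# Hypothetical per-survey keys, one block per source survey.
SURVEY_KEYS = ("harris", "gaia_edr3", "baumgardt", "apogee_dr17")

# Coverage counts as stated in the abstract (out of 174 clusters).
REPORTED = {"harris": 157, "gaia_edr3": 170, "baumgardt": 154, "apogee_dr17": 72}
N_CLUSTERS = 174


def load_jsonl(stream):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


def survey_coverage(records, keys=SURVEY_KEYS):
    """Count records whose block for each survey key is present and non-null."""
    return {k: sum(1 for r in records if r.get(k) is not None) for k in keys}


records = load_jsonl(io.StringIO(SAMPLE_JSONL))
coverage = survey_coverage(records)
for key, count in coverage.items():
    print(f"sample   {key}: {count}/{len(records)}")

# Express the abstract's coverage counts as percentages.
reported_pct = {k: round(100 * n / N_CLUSTERS, 1) for k, n in REPORTED.items()}
for key, pct in reported_pct.items():
    print(f"reported {key}: {REPORTED[key]}/{N_CLUSTERS} = {pct}%")
```

With the real file, `io.StringIO(SAMPLE_JSONL)` would be replaced by an open file handle, and the keys would need to be adjusted to whatever the documented schema actually uses. Because JSON `null` maps to Python `None`, missing survey data can be told apart from a zero value without sentinel numbers.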

