A new Gaia census of OB associations within 1 kpc
Pith reviewed 2026-05-17 00:43 UTC · model grok-4.3
The pith
Using Gaia astrometry on 25,000 O- and B-type stars, a new census identifies 56 OB associations within 1 kpc, doubling the known count.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We exploit a complete census of ∼25,000 O- and B-type stars within 1 kpc of the Sun to produce a highly-reliable catalogue of 56 OB associations using the HDBSCAN clustering algorithm, increasing the number of known OB associations by a factor of two within this volume. We assess the validity of this catalogue by crossmatching our OB association members with other catalogues of OB associations, star clusters and young stellar groups, confirming the high-confidence of our census of OB associations. We characterize these OB associations physically (total initial stellar mass, number of OB stars, ...) and kinematically (velocity dispersion, linear expansion ages, ...). The majority of the OB (
What carries the argument
The HDBSCAN density-based clustering algorithm applied to Gaia astrometric data of O- and B-type stars to group them into physical associations.
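As a rough illustration of the kind of input such a clustering step consumes, the sketch below converts Gaia-style astrometry (parallax and proper motions) into Cartesian positions and tangential velocities. The helper name `astrometry_to_features` and the choice of features are assumptions made for illustration; this page does not state which feature space the authors pass to HDBSCAN, and the naive 1/parallax distance ignores parallax zero-point offsets and measurement errors.

```python
import math

# 1 mas/yr of proper motion at 1 kpc corresponds to about 4.74047 km/s.
KAPPA = 4.74047

def astrometry_to_features(ra_deg, dec_deg, parallax_mas, pmra_masyr, pmdec_masyr):
    """Return ((x, y, z) in pc, (v_ra, v_dec) in km/s) for one star.

    Hypothetical helper: uses the naive distance d = 1000 / parallax.
    """
    if parallax_mas <= 0:
        raise ValueError("naive inversion needs a positive parallax")
    d_pc = 1000.0 / parallax_mas
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    # Equatorial Cartesian coordinates centred on the Sun.
    x = d_pc * math.cos(dec) * math.cos(ra)
    y = d_pc * math.cos(dec) * math.sin(ra)
    z = d_pc * math.sin(dec)
    # Tangential velocity: v = 4.74047 * mu [mas/yr] / parallax [mas].
    v_ra = KAPPA * pmra_masyr / parallax_mas
    v_dec = KAPPA * pmdec_masyr / parallax_mas
    return (x, y, z), (v_ra, v_dec)
```

A density-based algorithm would then be run on some rescaled combination of these columns; how positions and velocities are weighted against each other is itself a modelling choice that affects which groups are found.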
Load-bearing premise
The chosen HDBSCAN parameters and cross-matching procedure reliably separate true physical OB associations from chance alignments or unrelated stars.
What would settle it
Detailed radial-velocity or age measurements showing that a large fraction of stars assigned to the associations have velocities or properties inconsistent with membership in the same physical group.
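One way such a radial-velocity test could be framed is sketched below: for each association, count the members whose radial velocity sits far from the group median. The function name, the median/MAD statistic, and the 3-sigma cut are all assumptions chosen for illustration; neither the paper summary nor the review specifies a criterion.

```python
import statistics

def rv_outlier_fraction(rvs_kms, n_sigma=3.0):
    """Fraction of stars whose radial velocity lies more than n_sigma robust
    standard deviations from the group median (hypothetical helper)."""
    values = list(rvs_kms)
    med = statistics.median(values)
    # Median absolute deviation, scaled to match a Gaussian sigma.
    mad = statistics.median([abs(v - med) for v in values])
    sigma = 1.4826 * mad
    if sigma == 0.0:
        # Degenerate case: at least half the values equal the median.
        outliers = sum(1 for v in values if v != med)
    else:
        outliers = sum(1 for v in values if abs(v - med) > n_sigma * sigma)
    return outliers / len(values)
```

A consistently high outlier fraction across the newly identified groups would point toward contamination; a low one would not by itself prove membership, since binaries and measurement errors also broaden the distribution.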
Original abstract
OB associations are primordial tracers of star formation and Galactic structure. Originally defined about 80 years ago, their historical membership lists have been superseded thanks to the precise astrometry from ESA's Gaia satellite. Recent studies have however been mostly focused on individual OB associations or limited by the coverage of spectroscopic surveys. In this paper, we exploit a complete census of ∼25,000 O- and B-type stars within 1 kpc of the Sun to produce a highly-reliable catalogue of 56 OB associations using the HDBSCAN clustering algorithm, increasing the number of known OB associations by a factor of two within this volume. We assess the validity of this catalogue by crossmatching our OB association members with other catalogues of OB associations, star clusters and young stellar groups, confirming the high-confidence of our census of OB associations. We characterize these OB associations physically (total initial stellar mass, number of OB stars, ...) and kinematically (velocity dispersion, linear expansion ages, ...). The majority of the OB associations (38 out of 56) exhibit a significant expansion pattern in at least one direction, including 12 in both plane-of-the-sky directions, though differences in expansion velocity suggest anisotropical expansion patterns. We compare the locations of these OB associations with superclouds and features in the local Milky Way such as the Radcliffe Wave and discuss the implications for star formation in the solar neighbourhood.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper exploits a complete Gaia-based census of ~25,000 O- and B-type stars within 1 kpc to identify 56 OB associations via HDBSCAN clustering, claiming this doubles the number of known associations in the volume. Validity is assessed through cross-matching with existing OB association, cluster, and young-group catalogues. The associations are then characterized by total initial stellar mass, number of OB stars, velocity dispersion, and linear expansion ages; 38 show significant expansion (12 in both plane-of-sky directions), and their locations are compared to superclouds and features such as the Radcliffe Wave.
Significance. If the catalogue proves reliable, the work would provide a valuable doubled census of local OB associations with kinematic and mass characterizations, enabling improved studies of star formation, expansion patterns, and connections to Galactic structures. The use of a complete stellar sample and HDBSCAN is a methodological strength, though the central reliability claim requires quantitative support.
major comments (2)
- [§5] §5 (Validation and cross-matching): The manuscript states that validity was assessed via cross-matching with other catalogues, confirming high confidence. However, this procedure primarily recovers previously known structures and provides no reported purity, completeness, false-discovery-rate, or contamination statistics specifically for the 28 newly identified associations. No tests against randomized realizations or parameter-sensitivity analyses for HDBSCAN are described, leaving the 'highly-reliable' claim for the new entries unquantified and load-bearing for the doubled-census result.
- [§4] §4 (The OB association catalogue): The central claim of a factor-of-two increase rests on the 56 associations being physically distinct groups rather than chance alignments. Without the missing quantitative validation metrics noted above, the expansion-age and mass characterizations for the new associations cannot be confidently interpreted as representative of true OB associations.
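The randomized-realization test mentioned in the first major comment could look roughly like the sketch below. It uses a simple friends-of-friends grouping as a stand-in for HDBSCAN (an explicit substitution, so the example stays dependency-free) and a uniform random null inside the bounding box of the data; the function names, the linking length, and the null model are all assumptions for illustration, not anything described in the paper.

```python
import math
import random

def count_groups(points, linking_length, min_size):
    """Count friends-of-friends groups with at least min_size members."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of points closer than the linking length.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(1 for n in sizes.values() if n >= min_size)

def null_group_rate(points, linking_length, min_size, n_trials=20, seed=0):
    """Mean number of groups found in uniform random realizations drawn
    inside the bounding box of the real points (a deliberately crude null)."""
    rng = random.Random(seed)
    dims = list(zip(*points))
    lo = [min(d) for d in dims]
    hi = [max(d) for d in dims]
    total = 0
    for _ in range(n_trials):
        fake = [tuple(rng.uniform(a, b) for a, b in zip(lo, hi)) for _ in points]
        total += count_groups(fake, linking_length, min_size)
    return total / n_trials
```

Dividing the null rate by the number of groups found in the real data gives a rough expected false-discovery fraction. A more faithful null would preserve the large-scale density gradient of OB stars rather than assume uniformity.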
minor comments (2)
- [Abstract / §3] The abstract refers to 'linear expansion ages' and 'anisotropical expansion patterns' without a brief definition or reference to the exact fitting procedure used; this should be clarified in the methods or results section for reproducibility.
- [Table 1 / §4] Table or figure presenting the full catalogue should include columns for the number of member stars, estimated mass, and expansion velocity components to allow direct comparison with prior catalogues.
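For readers unfamiliar with the term raised in the first minor comment, a linear expansion age is commonly obtained by fitting velocity against position along one axis, v = κx + v0, and inverting the slope: τ = 1/(γκ), with κ in km/s/pc and γ = 1.0227 pc/Myr per km/s. The sketch below implements that convention with an ordinary least-squares slope. It is an assumed, generic recipe; the paper's exact fitting procedure is not given on this page.

```python
GAMMA = 1.0227  # 1 km/s expressed in pc/Myr

def expansion_age_myr(positions_pc, velocities_kms):
    """Fit v = kappa * x + v0 by least squares along one axis.

    Returns (kappa in km/s/pc, age in Myr). The age is None when the
    slope is not positive, i.e. no expansion signature along this axis.
    """
    n = len(positions_pc)
    mean_x = sum(positions_pc) / n
    mean_v = sum(velocities_kms) / n
    sxx = sum((x - mean_x) ** 2 for x in positions_pc)
    sxv = sum((x - mean_x) * (v - mean_v)
              for x, v in zip(positions_pc, velocities_kms))
    kappa = sxv / sxx
    if kappa <= 0:
        return kappa, None
    return kappa, 1.0 / (GAMMA * kappa)
```

Running the fit separately along each plane-of-the-sky direction yields two slopes, and therefore two ages, which is one way an anisotropic expansion pattern would show up.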
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We address the major comments regarding the validation and reliability of the new OB associations in the point-by-point responses below. We agree that additional quantitative support would strengthen the claims and will revise the manuscript accordingly where possible.
- Referee: §5 (Validation and cross-matching): The manuscript states that validity was assessed via cross-matching with other catalogues, confirming high confidence. However, this procedure primarily recovers previously known structures and provides no reported purity, completeness, false-discovery-rate, or contamination statistics specifically for the 28 newly identified associations. No tests against randomized realizations or parameter-sensitivity analyses for HDBSCAN are described, leaving the 'highly-reliable' claim for the new entries unquantified and load-bearing for the doubled-census result.
  Authors: We acknowledge that the cross-matching primarily validates the recovery of known associations and does not directly quantify the reliability of the 28 new ones through metrics such as purity or false discovery rate. The 'highly-reliable' description is based on the consistent application of HDBSCAN to a complete sample and the physical coherence of the detected groups, including their expansion signatures. However, we agree that this leaves the claim somewhat unquantified. In the revised manuscript, we will include a parameter-sensitivity analysis for HDBSCAN by varying the min_cluster_size and min_samples parameters and reporting how many associations remain stable. We will also discuss the limitations regarding the lack of randomized tests for the new associations.
  Revision: partial
- Referee: §4 (The OB association catalogue): The central claim of a factor-of-two increase rests on the 56 associations being physically distinct groups rather than chance alignments. Without the missing quantitative validation metrics noted above, the expansion-age and mass characterizations for the new associations cannot be confidently interpreted as representative of true OB associations.
  Authors: The factor-of-two increase is determined by comparing our catalogue of 56 associations to the number previously known within 1 kpc from literature compilations. We maintain that the associations are physically distinct because they are identified as overdensities in the 6D phase space (position and velocity) using HDBSCAN, and a majority exhibit expansion, which is unlikely for chance alignments. Nevertheless, we recognize the referee's concern about interpreting the characterizations for the new associations. We will add a caveat in the revised text noting that while the method is uniform, independent confirmation for the new groups is limited to the internal consistency of the data, and future spectroscopic follow-up would be beneficial.
  Revision: yes
- Full Monte Carlo simulations to compute purity, completeness, and false discovery rates for the newly identified associations
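The parameter-sensitivity analysis promised in the first response could be summarized with a stability score per association, as sketched below. Each rerun of the clustering (for example with different min_cluster_size and min_samples values) is compared against the baseline catalogue, and a group counts as recovered when some cluster in the rerun overlaps it strongly. The function names, the Jaccard overlap measure, and the 0.5 threshold are assumptions for illustration only.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of star identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def stable_fraction(baseline, reruns, threshold=0.5):
    """Fraction of reruns in which each baseline group is recovered.

    baseline: dict mapping group name -> iterable of star IDs.
    reruns:   list of runs, each a list of clusters (iterables of star IDs).
    A group is recovered in a run if any cluster in that run has a
    Jaccard overlap with it of at least `threshold`.
    """
    result = {}
    for name, members in baseline.items():
        hits = sum(
            1 for run in reruns
            if any(jaccard(members, cluster) >= threshold for cluster in run)
        )
        result[name] = hits / len(reruns)
    return result
```

Reporting this fraction separately for previously known and newly identified associations would show whether the new entries are more sensitive to the hyperparameter choice than the established ones.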
Circularity Check
Direct application of HDBSCAN to external Gaia data with external cross-match validation
The paper applies the HDBSCAN clustering algorithm directly to an external complete census of ~25,000 O- and B-type stars from Gaia within 1 kpc. The resulting catalogue of 56 associations is validated by cross-matching members against independent external catalogues of OB associations, star clusters, and young stellar groups. No equations, fitted parameters, or self-citations are used to derive the associations themselves; the chain consists of standard clustering on public astrometric data followed by external confirmation. This is self-contained against external benchmarks with no reduction of outputs to inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- HDBSCAN hyperparameters
axioms (1)
- domain assumption: Gaia DR3 (or equivalent) astrometry supplies sufficiently accurate positions, parallaxes, and proper motions for reliable clustering of OB stars within 1 kpc
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "we exploit a complete census of ∼25,000 O- and B-type stars within 1 kpc of the Sun to produce a highly-reliable catalogue of 56 OB associations using the HDBSCAN clustering algorithm"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Dynamical Cluster Assembly Framework (D-CAF): The Link Between Star Cluster Formation and Expansion Rates
  D-CAF simulations show that ongoing gas collapse during star formation shortens stellar crossing times, rendering gas expulsion more adiabatic and thereby regulating the survival and expansion rates of young stellar systems.
- The flare and spiral structure of the Milky Way's disc as traced by young giant stars
  Young giant stars reveal a flaring Milky Way disc with 3.5 kpc radial scale and extended spiral arms including a curved Perseus segment and a new Scutum-associated feature.