
arxiv: 2510.17721 · v2 · submitted 2025-10-20 · 🌌 astro-ph.IM

Graph-Based Light-Curve Features for Robust Transient Classification

Pith reviewed 2026-05-18 05:53 UTC · model grok-4.3

classification 🌌 astro-ph.IM
keywords visibility graphs · light curves · transient classification · graph features · astronomical time series · machine learning · MANTRA benchmark

The pith

Visibility graphs turn light curves into network features that let standard tree-based classifiers separate astronomical transient classes, reaching a macro-F1 of 0.622 on a filtered MANTRA subset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper converts irregular astronomical light curves into three visibility-graph representations: horizontal, directed, and weighted. From each graph it extracts a compact set of network statistics such as degree and strength moments, clustering coefficients, motifs, assortativity, path measures, and spectral summaries. These descriptors feed into tree-based models including LightGBM on a quality-controlled, class-balanced subset of 1705 objects drawn from the MANTRA benchmark. The best combination reaches macro-F1 of 0.622 and accuracy of 0.661, indicating that graph topology can serve as a survey-agnostic bridge from raw photometry to multiclass prediction without custom deep networks.
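
The horizontal visibility mapping described above can be sketched in a few lines. This is a minimal illustration using the standard horizontal visibility criterion (two samples are linked when every sample between them is strictly lower than both), not the authors' released code; the function name and the toy series are assumptions made for this example.

```python
def horizontal_visibility_edges(values):
    """Undirected HVG edges (i, j) with i < j for one time-ordered series."""
    n = len(values)
    edges = []
    for i in range(n - 1):
        highest_between = float("-inf")  # tallest sample strictly between i and j
        for j in range(i + 1, n):
            # i and j see each other if every sample between them is lower than both.
            if highest_between < min(values[i], values[j]):
                edges.append((i, j))
            highest_between = max(highest_between, values[j])
            if highest_between >= values[i]:
                break  # sample j blocks the horizontal line of sight from i
    return edges


toy_series = [3.0, 1.0, 2.0, 4.0]  # hypothetical brightness values, time-ordered
print(horizontal_visibility_edges(toy_series))
# prints [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```

Only the ordering of the samples enters the construction, not the time gaps between them, which is one reason the mapping can be applied to irregularly sampled series without resampling.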

Core claim

Each light curve is mapped to horizontal (HVG), directed horizontal (DHVG), and weighted horizontal (W-HVG) visibility graphs, and degree/strength moments, clustering and motifs, assortativity, path/efficiency, and spectral summaries are extracted from each. With these features LightGBM attains a macro-F1 of 0.622 ± 0.010 and an accuracy of 0.661 ± 0.010 on the filtered MANTRA subset, with the directed and weighted views supplying complementary information beyond undirected topology.
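
To make the directed and weighted views concrete, here is a hedged sketch of how in-degree, out-degree, and node strength could be derived from the same horizontal visibility edges. The edge direction (earlier sample to later sample) follows the usual directed-HVG convention; the edge weight `abs(values[j] - values[i])` is an assumption chosen for illustration, since this page does not state the paper's weight definition.

```python
def hvg_edges(values):
    """Undirected horizontal visibility edges (i, j), i < j."""
    edges = []
    for i in range(len(values) - 1):
        highest_between = float("-inf")
        for j in range(i + 1, len(values)):
            if highest_between < min(values[i], values[j]):
                edges.append((i, j))
            highest_between = max(highest_between, values[j])
            if highest_between >= values[i]:
                break
    return edges


def directed_and_weighted_views(values):
    """Per-node in-degree, out-degree (DHVG) and strength (weighted HVG)."""
    n = len(values)
    in_degree = [0] * n      # links arriving from earlier samples
    out_degree = [0] * n     # links leaving toward later samples
    strength = [0.0] * n     # sum of edge weights at each node
    for i, j in hvg_edges(values):
        out_degree[i] += 1
        in_degree[j] += 1
        weight = abs(values[j] - values[i])  # ASSUMED weight: amplitude contrast
        strength[i] += weight
        strength[j] += weight
    return in_degree, out_degree, strength


in_deg, out_deg, strength = directed_and_weighted_views([3.0, 1.0, 2.0, 4.0])
print(in_deg, out_deg, strength)
# prints [0, 1, 2, 2] [3, 1, 1, 0] [4.0, 3.0, 4.0, 3.0]
```

A gap between the in-degree and out-degree distributions reflects time asymmetry in the series, while strength adds amplitude information that the unweighted topology discards.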

What carries the argument

The three visibility-graph views (HVG, DHVG, W-HVG) of photometric time series, from which length-aware network descriptors are extracted to form input features for tree-based classifiers.
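
As an illustration of what a network descriptor looks like in practice, the sketch below computes two of the listed families (degree moments and the mean local clustering coefficient) from an edge list. The function name, the returned keys, and the choice to report the node count alongside the statistics are assumptions for this example; the paper's exact length-aware normalization is not given on this page.

```python
from itertools import combinations
from statistics import mean, pstdev


def graph_descriptors(n_nodes, edges):
    """Degree moments and mean local clustering for an undirected graph."""
    neighbors = [set() for _ in range(n_nodes)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)

    degrees = [len(nb) for nb in neighbors]

    local_clustering = []
    for nb in neighbors:
        k = len(nb)
        if k < 2:
            local_clustering.append(0.0)  # clustering is taken as 0 below degree 2
            continue
        # Fraction of neighbor pairs that are themselves connected.
        linked_pairs = sum(1 for a, b in combinations(nb, 2) if b in neighbors[a])
        local_clustering.append(linked_pairs / (k * (k - 1) / 2))

    return {
        "n_nodes": n_nodes,  # series length, kept so a model can account for it
        "degree_mean": mean(degrees),
        "degree_std": pstdev(degrees),
        "mean_clustering": mean(local_clustering),
    }


# Edges of the HVG for the toy series [3, 1, 2, 4].
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
print(graph_descriptors(4, toy_edges))
```

Each light curve yields one such fixed-length dictionary per graph view, so curves of different lengths map to feature vectors of the same size.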

If this is right

  • Weighted contrasts and directed asymmetry add complementary gains beyond undirected topology.
  • Strong separation occurs for CV, HPM, and Non-Tr. classes while residual confusions concentrate in the AGN-Blazar-SN block.
  • The method yields competitive multiclass results on quality-controlled data without requiring bespoke deep architectures.
  • The reported results hold for objects with at least 100 epochs, the minimum coverage (N_min = 100) required for inclusion in the evaluation subset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same graph descriptors could be tested on light curves from other surveys to check survey-agnostic behavior.
  • Combining these features with existing photometric statistics might further reduce confusions in the AGN-Blazar-SN group.
  • Releasing object IDs and code makes it straightforward to verify the numbers or extend the feature set on new transient catalogs.

Load-bearing premise

The chosen visibility-graph descriptors retain enough discriminative information from the original irregular photometric series to support multiclass separation after quality filtering.

What would settle it

Re-run the identical LightGBM pipeline on the released list of 1705 object IDs: a macro-F1 below 0.60, about two reported error bars under 0.622, would contradict the reported performance.
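
Anyone attempting that check needs the same metric definition. Macro-F1 is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. A minimal sketch with hypothetical labels follows; the function name is an assumption, and a library metric would normally be used instead.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not results from the paper.
truth = ["CV", "CV", "SN", "AGN"]
guess = ["CV", "SN", "SN", "AGN"]
print(round(macro_f1(truth, guess), 4))
# prints 0.7778
```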

Original abstract

We investigate graph-based representations of astronomical light curves for transient classification on a quality-controlled, class-balanced subset of the MANTRA benchmark (minimum coverage N_min=100 epochs; N=1705 objects after filtering and Non-Tr. subsampling). Each series is mapped to three visibility-graph views -- horizontal (HVG), directed (DHVG), and weighted (W-HVG) -- from which we extract compact, length-aware network descriptors (degree/strength moments, clustering and motifs, assortativity, path/efficiency, and spectral summaries). Using object-level stratified five-fold validation and tree-based learners, the best configuration (LightGBM with HVG+DHVG+W-HVG features) attains a macro-F1 of 0.622 +/- 0.010 and accuracy of 0.661 +/- 0.010 on this subset. For context, the published MANTRA baseline reports F1_macro=0.528 on the full dataset; because class priors differ after quality control, this reference is not a like-for-like comparison. Ablations show that weighted contrasts and directed asymmetry contribute complementary gains to undirected topology. Per-class analysis highlights strong performance for CV, HPM, and Non-Tr., with residual confusions concentrated in the AGN-Blazar-SN block. These results indicate that visibility graphs offer a simple, survey-agnostic bridge between irregular photometric time series and standard classifiers, yielding competitive multiclass performance without bespoke deep architectures. We release code and feature definitions, together with the list of object IDs used in the evaluation subset, to facilitate reproducibility and future extensions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript investigates graph-based representations of astronomical light curves for transient classification. Each light curve is converted to horizontal (HVG), directed (DHVG), and weighted (W-HVG) visibility graphs, from which compact network descriptors (degree/strength moments, clustering, motifs, assortativity, path measures, spectral summaries) are extracted. On a post-filtered, class-balanced subset of the MANTRA benchmark (N_min=100 epochs, N=1705 objects after Non-Tr. subsampling), LightGBM with the combined HVG+DHVG+W-HVG feature set attains macro-F1 of 0.622 ± 0.010 and accuracy of 0.661 ± 0.010 under object-level stratified five-fold cross-validation. Ablations indicate complementary gains from weighted and directed views; per-class results are strong for CV, HPM, and Non-Tr. but show residual confusions in the AGN-Blazar-SN group. Code, feature definitions, and the object-ID list are released.

Significance. If the discriminative power of the visibility-graph descriptors holds under proper controls, the work supplies a simple, survey-agnostic, and computationally lightweight feature pipeline that maps irregular photometric series directly to off-the-shelf classifiers without bespoke deep architectures. Explicit strengths include the release of code and the exact evaluation list, the use of stratified five-fold validation with error bars, and ablations that isolate contributions from directed asymmetry and weighted contrasts. These elements support reproducibility and incremental extension by the community.
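
The object-level stratified five-fold protocol mentioned above can be pictured as follows. This is a minimal sketch, assuming one label per object and a simple round-robin assignment within each class; the function name, seed, and toy object IDs are hypothetical, and the paper's own splitting code may differ.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels_by_object, n_folds=5, seed=0):
    """Assign each object ID to one fold while keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for object_id, label in sorted(labels_by_object.items()):
        by_class[label].append(object_id)

    fold_of = {}
    for label in sorted(by_class):
        ids = by_class[label]
        rng.shuffle(ids)
        # Deal the shuffled objects of this class across folds like cards.
        for position, object_id in enumerate(ids):
            fold_of[object_id] = position % n_folds
    return fold_of


# Hypothetical toy catalogue: 10 CV objects and 15 SN objects.
toy_labels = {f"obj{i}": ("CV" if i < 10 else "SN") for i in range(25)}
folds = stratified_folds(toy_labels)
per_fold = Counter((fold, toy_labels[obj]) for obj, fold in folds.items())
print(sorted(per_fold.items()))
```

Splitting by object rather than by observation keeps all epochs of one source in a single fold, so no object contributes to both training and validation.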

major comments (2)
  1. Abstract and Results section: The headline result (LightGBM + HVG+DHVG+W-HVG, macro-F1 0.622 ± 0.010) is offered as competitive with the published MANTRA F1_macro=0.528. The manuscript correctly notes that class priors differ after N_min=100 filtering and Non-Tr. subsampling, rendering the numbers non-comparable. However, no control experiment applies any conventional feature set (statistical moments, Lomb-Scargle summaries, etc.) to the identical 1705-object list under the same learner and same object-level 5-fold CV. Without this baseline, the specific contribution of the visibility-graph descriptors cannot be isolated from the effects of quality filtering and balancing. This is load-bearing for the central claim that the graph features provide a competitive bridge to multiclass classification.
  2. Methods and Ablation sections: The claim that the chosen descriptors (degree moments, clustering, motifs, assortativity, path/efficiency, spectral summaries) retain sufficient discriminative information rests on performance after aggressive filtering to high-coverage objects. No direct test of information retention—such as a comparison of classification performance using the raw light-curve statistics versus the graph-derived features on the same objects—is reported. This leaves open whether simpler descriptors would achieve similar scores on this particular subset.
minor comments (2)
  1. Abstract: The statement that code and the object-ID list are released would benefit from an explicit repository URL or DOI to improve immediate accessibility for readers.
  2. Per-class analysis: The residual confusions concentrated in the AGN-Blazar-SN block are noted qualitatively; inclusion of a normalized confusion matrix or pairwise F1 scores would allow quantitative assessment of the severity of these confusions.
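
The normalized confusion matrix requested in minor comment 2 is straightforward to produce. A minimal sketch, with hypothetical labels and an assumed function name, where each row is divided by the number of true objects in that class:

```python
def normalized_confusion(y_true, y_pred, classes):
    """Row-normalized confusion matrix: entry [r][c] is P(predicted c | true r)."""
    index = {label: k for k, label in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # Leave an all-zero row for classes with no true examples.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy labels, not results from the paper.
truth = ["AGN", "AGN", "SN", "SN"]
guess = ["AGN", "SN", "SN", "SN"]
print(normalized_confusion(truth, guess, ["AGN", "SN"]))
# prints [[0.5, 0.5], [0.0, 1.0]]
```

Row normalization makes off-diagonal entries comparable across classes of different sizes, which is what a quantitative reading of the AGN-Blazar-SN block would need.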

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments highlight important points regarding the isolation of our graph-based features' contribution, and we address each below with plans for revision.

  1. Referee: Abstract and Results section: The headline result (LightGBM + HVG+DHVG+W-HVG, macro-F1 0.622 ± 0.010) is offered as competitive with the published MANTRA F1_macro=0.528. The manuscript correctly notes that class priors differ after N_min=100 filtering and Non-Tr. subsampling, rendering the numbers non-comparable. However, no control experiment applies any conventional feature set (statistical moments, Lomb-Scargle summaries, etc.) to the identical 1705-object list under the same learner and same object-level 5-fold CV. Without this baseline, the specific contribution of the visibility-graph descriptors cannot be isolated from the effects of quality filtering and balancing. This is load-bearing for the central claim that the graph features provide a competitive bridge to multiclass classification.

    Authors: We agree that a direct baseline using conventional features on the identical 1705-object filtered subset under the same LightGBM learner and object-level stratified 5-fold CV is necessary to isolate the specific contribution of the visibility-graph descriptors from the effects of quality filtering and class balancing. In the revised manuscript we will add this control experiment, extracting standard statistical moments and Lomb-Scargle summaries from the same light curves and reporting the resulting macro-F1 and accuracy for comparison. revision: yes

  2. Referee: Methods and Ablation sections: The claim that the chosen descriptors (degree moments, clustering, motifs, assortativity, path/efficiency, spectral summaries) retain sufficient discriminative information rests on performance after aggressive filtering to high-coverage objects. No direct test of information retention—such as a comparison of classification performance using the raw light-curve statistics versus the graph-derived features on the same objects—is reported. This leaves open whether simpler descriptors would achieve similar scores on this particular subset.

    Authors: We concur that a head-to-head comparison of raw light-curve statistics against the graph-derived features on the exact same 1705 high-coverage objects would provide clearer evidence that the network descriptors retain discriminative information beyond simpler summaries. We will include this analysis in the revised manuscript by computing basic statistical features directly from the raw series of these objects and evaluating them under the identical cross-validation protocol. revision: yes
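
A control of the kind promised in both responses could start from something as simple as the sketch below, which computes basic statistical features directly from the raw magnitudes. The feature list and names are assumptions for illustration; Lomb-Scargle summaries are omitted here, and this is not the authors' planned baseline.

```python
from statistics import mean, median, pstdev


def baseline_features(magnitudes):
    """Simple per-object summary statistics computed from raw magnitudes."""
    mu = mean(magnitudes)
    sigma = pstdev(magnitudes)
    if sigma > 0:
        # Standardized third moment (population skewness).
        skewness = mean([((m - mu) / sigma) ** 3 for m in magnitudes])
    else:
        skewness = 0.0  # a constant series has no asymmetry
    return {
        "mean": mu,
        "std": sigma,
        "skewness": skewness,
        "median": median(magnitudes),
        "amplitude": max(magnitudes) - min(magnitudes),
    }


print(baseline_features([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Feeding these vectors to the same learner under the same folds, on the same 1705 objects, is what would separate the contribution of the graph descriptors from the effect of filtering.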

Circularity Check

0 steps flagged

Standard feature extraction plus supervised classification; no derivations reduce to inputs by construction


The paper applies visibility-graph mappings (HVG, DHVG, W-HVG) to light curves, extracts standard network statistics (degree moments, clustering, motifs, etc.), and feeds the resulting feature vectors into off-the-shelf classifiers (LightGBM, etc.) under object-level 5-fold CV. Reported macro-F1 and accuracy are direct empirical outcomes of this pipeline on the filtered 1705-object subset. No equations, ansatzes, or self-citations are invoked to derive or force the performance numbers; the workflow is self-contained against the external MANTRA benchmark (with explicit caveats on subset differences). No self-definitional, fitted-input-called-prediction, or load-bearing self-citation patterns appear.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that visibility graphs preserve classification-relevant structure in irregularly sampled light curves and on standard supervised-learning assumptions about feature independence and cross-validation validity.

free parameters (1)
  • N_min
    Threshold of 100 epochs used to define the quality-controlled subset; directly affects which objects enter the evaluation.
axioms (1)
  • domain assumption: Visibility graphs constructed from light curves retain sufficient temporal and amplitude information for multiclass discrimination.
    Invoked when mapping photometric series to HVG/DHVG/W-HVG and extracting descriptors for classification.
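
Because N_min decides which objects enter the evaluation, it helps to see how such a filter might look. The sketch below keeps objects with at least `n_min` epochs and optionally subsamples the Non-Tr. class. The cap value, the seed, and all names are hypothetical: the page states that Non-Tr. objects were subsampled but not how many were kept.

```python
import random


def select_evaluation_subset(epoch_counts, labels, n_min=100, non_tr_cap=None, seed=0):
    """Keep objects with >= n_min epochs; optionally cap the Non-Tr. class."""
    kept = [obj for obj in sorted(epoch_counts) if epoch_counts[obj] >= n_min]
    if non_tr_cap is None:
        return kept

    non_tr = [obj for obj in kept if labels[obj] == "Non-Tr."]
    others = [obj for obj in kept if labels[obj] != "Non-Tr."]
    if len(non_tr) > non_tr_cap:
        # ASSUMED rule: uniform random subsample down to the cap.
        non_tr = random.Random(seed).sample(non_tr, non_tr_cap)
    return sorted(others + non_tr)


# Hypothetical toy catalogue.
toy_epochs = {"a": 150, "b": 99, "c": 100, "d": 300, "e": 120}
toy_labels = {"a": "CV", "b": "SN", "c": "Non-Tr.", "d": "Non-Tr.", "e": "Non-Tr."}
print(select_evaluation_subset(toy_epochs, toy_labels))
# prints ['a', 'c', 'd', 'e']
```

Raising or lowering `n_min` changes both the sample size and the class priors, which is why comparisons against results on the full dataset are not like-for-like.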

pith-pipeline@v0.9.0 · 5824 in / 1288 out tokens · 38699 ms · 2026-05-18T05:53:03.472249+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

    Relation between the paper passage and the cited Recognition theorem.

    Each series is mapped to three visibility-graph views—horizontal (HVG), directed (DHVG), and weighted (W-HVG)—from which we extract compact, length-aware network descriptors (degree/strength moments, clustering and motifs, assortativity, path/efficiency, and spectral summaries).

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.