A useful representation of TESS light curves
Pith reviewed 2026-05-08 16:08 UTC · model grok-4.3
The pith
A quantile-graph representation projected onto a self-organizing map organizes TESS light curves by amplitude, signal-to-noise ratio, timescale, and shape.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a simple and interpretable representation of TESS light curves designed for large-scale exploratory analysis. Our goal is not to optimize classification performance, but to construct a computationally efficient mapping in which proximity reflects meaningful similarity, without using labels or explicit period information as inputs. We represent each light curve using either quantile graphs or scattering transforms, reduce dimensionality with principal component analysis, and project the resulting features onto a self-organizing map. We evaluate ~1500 model configurations using a combination of standard embedding diagnostics and a light-curve-shape-based cohesion metric, and select a compact quantile-graph-based model that balances interpretability, stability, and performance.
What carries the argument
The self-organizing map built from principal components of quantile-graph encodings of the light curves. It places the reduced features on a two-dimensional grid so that proximity corresponds to similarity in variability amplitude, timescale, shape, and signal quality.
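To make the encoding step concrete, here is a minimal sketch of a quantile-graph feature extractor in the usual sense: each flux value is assigned to a quantile bin, and transitions between the bins of consecutive samples are counted and row-normalized. The function name `quantile_graph`, the choice of 8 quantiles, and the single lag of 1 are illustrative assumptions, not the configuration selected in the paper.

```python
import numpy as np

def quantile_graph(flux, n_quantiles=8, lag=1):
    """Row-normalized quantile-transition matrix of a 1-D series.

    Hypothetical helper: n_quantiles and lag are illustrative defaults.
    """
    flux = np.asarray(flux, dtype=float)
    # Simplification: dropping non-finite cadences treats samples on either
    # side of a gap as consecutive.
    flux = flux[np.isfinite(flux)]
    # Interior quantile edges split the flux values into n_quantiles bins.
    edges = np.quantile(flux, np.linspace(0, 1, n_quantiles + 1)[1:-1])
    labels = np.searchsorted(edges, flux, side="right")  # integers in 0..Q-1
    counts = np.zeros((n_quantiles, n_quantiles))
    # Count transitions from the bin at sample t to the bin at sample t + lag.
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transitions are left as all zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

rng = np.random.default_rng(0)
t = np.linspace(0, 20 * np.pi, 5000)
smooth = quantile_graph(np.sin(t))                 # slowly varying signal
noisy = quantile_graph(rng.normal(size=5000))      # white noise
features = smooth.ravel()                          # flattened feature vector
```

A smooth signal concentrates its transitions near the diagonal, while white noise spreads them roughly evenly across bins, which is one way such an encoding can separate timescale and signal-to-noise ratio without a period search.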
Load-bearing premise
The light-curve-shape cohesion metric and embedding diagnostics correctly identify a configuration in which map proximity reflects genuine similarity in stellar variability rather than artifacts of the selection process itself.
What would settle it
Finding that repeat observations of the same stars frequently land in distant or non-contiguous regions of the map would show that the representation fails to capture persistent properties.
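One way this test could be quantified is sketched below: given the best-matching unit (BMU) of each of two observations of the same star, measure how often the two nodes are adjacent on the grid. The function names, the rectangular grid, the Chebyshev distance, and the example coordinates are all assumptions for illustration; the paper's own stability measure and grid topology may differ.

```python
def grid_distance(node_a, node_b):
    """Chebyshev distance between two (row, col) nodes on a rectangular grid."""
    return max(abs(node_a[0] - node_b[0]), abs(node_a[1] - node_b[1]))

def stable_fraction(bmu_pairs, max_distance=1):
    """Fraction of repeat-observation pairs whose BMUs lie within max_distance."""
    if not bmu_pairs:
        raise ValueError("need at least one pair of observations")
    close = sum(grid_distance(a, b) <= max_distance for a, b in bmu_pairs)
    return close / len(bmu_pairs)

# Made-up BMU coordinates for four stars, each observed in two sectors.
pairs = [((3, 4), (3, 5)), ((10, 2), (10, 2)), ((0, 0), (7, 9)), ((6, 6), (5, 7))]
print(stable_fraction(pairs))  # 0.75
```

A low fraction is only meaningful relative to a baseline, so the same quantity would normally also be computed for pairs formed from different, randomly matched stars.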
Original abstract
We present a simple and interpretable representation of TESS light curves designed for large-scale exploratory analysis. Our goal is not to optimize classification performance, but to construct a computationally efficient mapping in which proximity reflects meaningful similarity, without using labels or explicit period information as inputs. We represent each light curve using either quantile graphs or scattering transforms, reduce dimensionality with principal component analysis, and project the resulting features onto a self-organizing map (SOM). We evaluate ~1500 model configurations using a combination of standard embedding diagnostics and a light-curve-shape-based cohesion metric, and select a compact quantile-graph-based model that balances interpretability, stability, and performance. Applying the model to ~1.5 million TESS 2-minute cadence light curves, we find that the map organizes sources primarily by variability amplitude, signal-to-noise ratio, characteristic timescale, and light-curve shape. Repeat observations of the same stars show that most sources occupy stable and contiguous regions of the map, indicating that the representation captures persistent properties rather than noise and systematics. We provide an interactive web interface at http://tess-l8.space that enables inspection of nodes, nearest neighbors, and individual sources across sectors. The resulting representation serves as a practical tool for exploration, anomaly detection, and dataset characterization, and illustrates how simple, deterministic encodings can yield useful structure in large astronomical time-series datasets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a representation for TESS light curves using quantile graphs (or scattering transforms), PCA dimensionality reduction, and a self-organizing map (SOM). After systematically evaluating ~1500 configurations with standard embedding diagnostics plus a custom light-curve-shape cohesion metric, the authors select a compact quantile-graph model. They apply it to ~1.5 million 2-minute TESS light curves and report that the resulting map organizes sources primarily by variability amplitude, signal-to-noise ratio, characteristic timescale, and light-curve shape. Repeat observations of the same stars are shown to occupy stable, contiguous regions, suggesting the representation captures persistent source properties. An interactive web interface is provided for exploration.
Significance. If the central claims hold after addressing selection concerns, the work offers a practical, computationally efficient, and interpretable tool for large-scale exploratory analysis of TESS time-series data. Strengths include the scale of application (1.5M light curves), the public interactive interface, emphasis on deterministic encodings without labels or periods, and potential utility for anomaly detection and dataset characterization. This aligns with needs in astro-ph.IM for methods that reveal structure in high-volume survey data.
major comments (2)
- [Methods (model evaluation and selection)] Methods section describing model selection: The final configuration is chosen after evaluating ~1500 variants using a combination of standard diagnostics and a light-curve-shape-based cohesion metric. Because this metric is defined in terms of shape similarity (one of the four claimed organizing axes) and selection occurs after inspecting all results, the observed organization by amplitude, SNR, timescale, and shape on the 1.5M sources, as well as the repeat-observation stability, could be inflated by post-hoc optimization rather than reflecting an intrinsic property of the encoding. The manuscript should specify whether selection criteria were pre-registered, whether held-out data or external labels were used during choice, and report quantitative scores (e.g., cohesion values, embedding quality metrics) for the selected model versus representative alternatives.
- [Results (application and stability analysis)] Results section on application to 1.5M light curves: The claim that proximity in the map reflects meaningful similarity rests on the selected SOM being a faithful embedding. Without explicit details on data exclusion rules, sector coverage, or how sources with multiple observations were chosen for the stability test, it is unclear whether the reported organization and contiguity are robust to reasonable variations in the input sample.
minor comments (2)
- [Abstract and Results] The abstract states that the representation 'organizes sources primarily by' four properties, but the main text should include quantitative support (e.g., correlation coefficients or variance explained by each axis) rather than qualitative description alone.
- [Discussion or Conclusions] The interactive interface at http://tess-l8.space is a valuable contribution; a brief description of its features (node inspection, nearest-neighbor search, sector navigation) should be added to the main text or a dedicated subsection for readers who do not immediately access the site.
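The quantitative support requested in the first minor comment could take several forms. One simple option, sketched here as an assumption rather than anything the authors report, is the fraction of variance in a source property (for example, log amplitude) that is explained by SOM node membership, i.e. the eta-squared statistic. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def variance_explained_by_nodes(node_ids, values):
    """Eta squared: share of the variance of `values` explained by node membership.

    node_ids: 1-D array of flat node indices, one per source.
    values:   1-D array of a source property, same length.
    """
    node_ids = np.asarray(node_ids)
    values = np.asarray(values, dtype=float)
    grand_mean = values.mean()
    total_ss = np.sum((values - grand_mean) ** 2)
    if total_ss == 0:
        return 0.0  # a constant property has no variance to explain
    _, inverse, counts = np.unique(node_ids, return_inverse=True, return_counts=True)
    node_means = np.bincount(inverse, weights=values) / counts
    between_ss = np.sum(counts * (node_means - grand_mean) ** 2)
    return float(between_ss / total_ss)

perfect = variance_explained_by_nodes([0, 0, 1, 1], [1.0, 1.0, 3.0, 3.0])
none = variance_explained_by_nodes([0, 0, 1, 1], [1.0, 3.0, 1.0, 3.0])
```

Here `perfect` is 1 because the property is constant within each node, and `none` is 0 because both nodes share the same mean. With many nodes and few sources per node the statistic is biased upward, so it would need a shuffled-label baseline.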
Simulated Author's Rebuttal
We thank the referee for their constructive review and positive assessment of the manuscript's potential utility. We address each major comment below and will revise the manuscript accordingly to improve clarity and address concerns about selection and robustness.
Point-by-point responses
- Referee: [Methods (model evaluation and selection)] The final configuration is chosen after evaluating ~1500 variants using a combination of standard diagnostics and a light-curve-shape-based cohesion metric. Because this metric is defined in terms of shape similarity and selection occurs after inspecting all results, the observed organization could be inflated by post-hoc optimization. The manuscript should specify whether selection criteria were pre-registered, whether held-out data or external labels were used, and report quantitative scores for the selected model versus alternatives.
  Authors: We acknowledge that the cohesion metric, being based on shape similarity, could introduce some dependence when assessing the shape axis. However, the other organizing axes (amplitude, SNR, and timescale) were evaluated using independent standard embedding diagnostics such as trustworthiness, continuity, and neighborhood preservation, none of which rely on the cohesion metric. Selection criteria were not pre-registered, as this practice is not standard for unsupervised exploratory methods in astronomy; no external labels were used at any stage. We will add a dedicated subsection in the Methods section that reports the full set of quantitative scores (cohesion, trustworthiness, etc.) for the selected model alongside several representative alternatives (e.g., the next-best models by each metric). This will allow readers to assess the sensitivity of the final choice.
  Revision: yes
- Referee: [Results (application and stability analysis)] Without explicit details on data exclusion rules, sector coverage, or how sources with multiple observations were chosen for the stability test, it is unclear whether the reported organization and contiguity are robust to reasonable variations in the input sample.
  Authors: We will expand the Results section (and add a brief methods subsection on data preparation) to specify the exact exclusion rules applied to the 1.5 million light curves, including quality-flag thresholds, minimum cadence coverage, and any sector-specific filters. The sample comprises all publicly available 2-minute TESS sectors at the time of analysis. For the stability test, we included every source with observations in two or more sectors and used the chronologically first two observations per source; we will also report a supplementary check using randomly selected pairs to confirm that contiguity is not sensitive to this choice. These additions will make the robustness explicit.
  Revision: yes
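The trustworthiness diagnostic named in the first response can be made concrete with a short sketch. It follows the usual Venna and Kaski definition: points that appear among the k nearest neighbours in the embedding but not in the original feature space are penalized by how far away they really are. This is a brute-force illustration on synthetic data, not the authors' evaluation code, and the helper names are hypothetical. When many sources share one SOM node, distances in the embedding are tied, and how ties are broken affects the score.

```python
import numpy as np

def _neighbor_ranks(points):
    """ranks[i, j] = rank of j among the neighbours of i (1 = nearest, 0 = self)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # force each point to be its own rank-0 neighbour
    order = np.argsort(dist, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(len(points))[:, None]
    ranks[rows, order] = np.arange(len(points))[None, :]
    return ranks

def trustworthiness(features, embedding, k=5):
    """Trustworthiness in [0, 1]; 1 means no false neighbours in the embedding."""
    features = np.asarray(features, dtype=float)
    embedding = np.asarray(embedding, dtype=float)
    n = len(features)
    if 2 * n - 3 * k - 1 <= 0:
        raise ValueError("k is too large for this sample size")
    rank_feat = _neighbor_ranks(features)
    rank_emb = _neighbor_ranks(embedding)
    # Intruders: among the k nearest in the embedding, but not in feature space.
    intruders = (rank_emb >= 1) & (rank_emb <= k) & (rank_feat > k)
    penalty = np.sum((rank_feat - k)[intruders])
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))

rng = np.random.default_rng(0)
features = rng.normal(size=(60, 4))
t_identity = trustworthiness(features, features)                   # faithful embedding
t_shuffled = trustworthiness(features, rng.permutation(features))  # scrambled embedding
```

Reporting this score, and its continuity counterpart with the roles of the two spaces swapped, for the selected model and a few alternatives is the kind of table the referee asks for.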
Circularity Check
No significant circularity; representation constructed bottom-up from data features without definitional reduction.
Full rationale
The paper builds the representation via quantile graphs or scattering transforms, followed by PCA dimensionality reduction and SOM projection. Model selection among ~1500 configurations relies on standard embedding diagnostics plus a custom cohesion metric, after which the map is applied to 1.5M light curves and observed to organize by amplitude, SNR, timescale, and shape, with stability confirmed on repeat observations. No equations are presented that define the output organization in terms of the selection metric itself, no self-citations are load-bearing, and no ansatz or uniqueness theorem is invoked to force the result. The derivation remains self-contained and data-driven rather than tautological.
Axiom & Free-Parameter Ledger
free parameters (2)
- PCA dimensionality
- SOM grid size and training parameters
axioms (2)
- Domain assumption: Quantile graphs capture sufficient information about light-curve shape and amplitude for similarity assessment.
- Standard math: Self-organizing maps arrange inputs so that proximity corresponds to feature-space similarity.
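The two free parameters and the SOM axiom in this ledger can be illustrated with a minimal sketch: PCA by singular value decomposition, followed by a small online SOM with a Gaussian neighbourhood that shrinks over training. Every name and default here (`pca_reduce`, `train_som`, the 6x6 grid, learning rate, neighbourhood width, step count) is an assumption chosen for illustration; the paper's selected values are not stated in this summary.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project centred features onto their leading principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def train_som(data, rows=6, cols=6, n_steps=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Online SOM training; returns node weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Grid coordinates of every node, shape (rows * cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    ).reshape(-1, 2)
    # Initialise node weights from randomly chosen samples.
    weights = data[rng.integers(0, len(data), size=rows * cols)].copy()
    for step in range(n_steps):
        frac = step / n_steps
        lr = lr0 * (1.0 - frac)                      # learning rate decays to 0
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood shrinks to 0.5
        x = data[rng.integers(len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull the BMU and its grid neighbours toward the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights.reshape(rows, cols, -1)

def best_matching_unit(weights, x):
    """(row, col) of the node whose weight vector is closest to x."""
    d2 = ((weights - x) ** 2).sum(axis=-1)
    return np.unravel_index(np.argmin(d2), d2.shape)

rng = np.random.default_rng(1)
cluster_a = rng.normal(0.0, 0.3, size=(200, 5))
cluster_b = rng.normal(10.0, 0.3, size=(200, 5))
reduced = pca_reduce(np.vstack([cluster_a, cluster_b]), n_components=2)
som = train_som(reduced)
bmu_a = best_matching_unit(som, reduced[0])
bmu_b = best_matching_unit(som, reduced[-1])
```

Because neighbouring nodes are updated together, nearby nodes end up with similar weight vectors, which is the property the second axiom relies on. The number of retained components and the grid size both change how finely that structure is resolved.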