Determining surface rotation periods of solar-like stars observed by the Kepler mission using machine learning techniques

A. Le Saux; A.R.G. Santos; L. Bugnet; P.L. Palle; R.A. Garcia; S. Mathur; S.N. Breton

arxiv: 1906.09609 · v1 · pith:ZNCRUAWFnew · submitted 2019-06-23 · 🌌 astro-ph.SR

S.N. Breton, L. Bugnet, A.R.G. Santos, A. Le Saux, S. Mathur, P.L. Palle, R.A. Garcia

Pith reviewed 2026-05-25 17:51 UTC · model grok-4.3

classification 🌌 astro-ph.SR

keywords stellar rotation · Kepler mission · machine learning · random forest · main-sequence stars · surface rotation periods · photometric analysis · solar-like stars

The pith

Random forest classifiers can identify rotating main-sequence stars in Kepler data and select the best method to measure their surface rotation periods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that random forest classifiers can handle two key decisions when measuring surface rotation from Kepler light curves of solar-like stars. The first classifier uses input parameters to sort targets into rotating main-sequence stars, non-rotating main-sequence stars, red giants, binaries, or pulsators. A second classifier then operates only on the rotating main-sequence stars to choose among time-frequency, autocorrelation, or composite-spectrum analysis techniques. This automation matters because rotation periods in principle allow age estimates for large samples of stars, where manual classification and method selection would be impractical. If the classifiers perform as intended, the process becomes scalable to the full Kepler catalog without losing the ability to adapt the analysis to each star's characteristics.

Core claim

The authors claim that random forest classifiers trained on input parameters and appropriate labels can first discriminate Kepler targets into the categories of rotating main-sequence stars, non-rotating main-sequence stars, red giants, binaries, and pulsators, and can then, for the rotating main-sequence subset alone, select the most suitable data-analysis treatment among the available methods for extracting surface rotation periods.
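The routing implied by this claim can be sketched in a few lines. This is a minimal illustration, not the authors' code: the label spellings, the feature names `spot_amplitude` and `acf_peak`, and the two threshold functions are hypothetical stand-ins for trained random forests. The only point shown is that the second classifier is consulted solely for stars that the first one labels as rotating main-sequence.

```python
from typing import Callable, Dict, Optional

# Class and method names follow the abstract; the string spellings are illustrative.
ROTATING_MS = "rotating_ms"
STAGE1_CLASSES = {ROTATING_MS, "non_rotating_ms", "red_giant", "binary", "pulsator"}
METHODS = {"wavelet", "autocorrelation", "composite_spectrum"}

Features = Dict[str, float]
Classifier = Callable[[Features], str]


def run_pipeline(
    stars: Dict[str, Features],
    classify_star: Classifier,
    choose_method: Classifier,
) -> Dict[str, Dict[str, Optional[str]]]:
    """Apply stage 1 to every star, and stage 2 only to rotating MS stars."""
    results: Dict[str, Dict[str, Optional[str]]] = {}
    for star_id, features in stars.items():
        star_class = classify_star(features)
        if star_class not in STAGE1_CLASSES:
            raise ValueError(f"unknown stellar class: {star_class!r}")
        method = None
        if star_class == ROTATING_MS:
            method = choose_method(features)
            if method not in METHODS:
                raise ValueError(f"unknown method: {method!r}")
        results[star_id] = {"class": star_class, "method": method}
    return results


# Hypothetical stand-ins for the two trained classifiers: plain thresholds,
# NOT random forests and NOT the paper's input parameters.
def toy_classify_star(features: Features) -> str:
    return ROTATING_MS if features["spot_amplitude"] > 0.5 else "non_rotating_ms"


def toy_choose_method(features: Features) -> str:
    return "autocorrelation" if features["acf_peak"] > 0.3 else "wavelet"


stars = {
    "star_a": {"spot_amplitude": 0.9, "acf_peak": 0.6},
    "star_b": {"spot_amplitude": 0.1, "acf_peak": 0.6},
}
routed = run_pipeline(stars, toy_classify_star, toy_choose_method)
```

Passing the classifiers in as callables keeps the routing logic independent of how each stage is trained, so either stage could be swapped for a real model without touching `run_pipeline`.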

What carries the argument

Two sequential random forest classifiers: the first sorts stars by type using input parameters, while the second chooses the optimal analysis technique (wavelet-based, autocorrelation, or composite spectrum) for each confirmed rotating main-sequence star.

If this is right

Large numbers of Kepler targets can be processed without individual manual inspection at the classification or method-selection stages.
Rotation-period catalogs produced this way will reflect a consistent, data-driven choice of analysis technique for each rotating main-sequence star.
The same two-step classifier structure can be retrained and applied whenever new photometric time series become available for solar-like stars.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the method succeeds on Kepler data, the same classifier architecture could be retrained on TESS light curves to produce rotation periods for a different sample of solar-like stars.
More uniform rotation measurements across many stars would tighten empirical relations between rotation period, color, and age, even if the paper does not demonstrate that tightening.
The approach could be tested for robustness by checking whether the chosen analysis methods produce periods that agree with independent spectroscopic measurements on a subset of stars.

Load-bearing premise

The labels used to train the classifiers accurately reflect both the true stellar categories and the objectively best analysis method for each rotating main-sequence star.

What would settle it

Running the trained classifiers on a held-out set of stars whose categories and preferred analysis methods have been independently verified by human experts, and finding that a large fraction of predictions disagree with those verifications, would show the approach does not work.
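A check of this kind reduces to comparing two label lists. The sketch below assumes the predictions and the expert-verified labels are already aligned star by star. The labels are made up for illustration, using the class abbreviations from the Figure 2 caption, and `disagreement_rate` is a hypothetical helper name.

```python
from typing import Sequence


def disagreement_rate(predicted: Sequence[str], verified: Sequence[str]) -> float:
    """Fraction of held-out stars where the classifier and the experts differ."""
    if len(predicted) != len(verified):
        raise ValueError("label lists must have the same length")
    if not predicted:
        raise ValueError("need at least one held-out star")
    mismatches = sum(p != v for p, v in zip(predicted, verified))
    return mismatches / len(predicted)


# Made-up labels for five held-out stars (not data from the paper).
expert_labels = ["Prot", "Prot", "RG", "norot", "CP"]
model_labels = ["Prot", "norot", "RG", "norot", "CP"]
rate = disagreement_rate(model_labels, expert_labels)
```

What counts as "a large fraction" is a judgment the source leaves open; the function only reports the number.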

Figures

Figures reproduced from arXiv: 1906.09609 by A. Le Saux, A.R.G. Santos, L. Bugnet, P.L. Palle, R.A. Garcia, S. Mathur, S.N. Breton.

**Figure 1.** From top to bottom: example of a Kepler photometric light curve analyzed with the A2Z pipeline (Mathur et al. 2010), wavelet power spectrum, autocorrelation function and composite spectrum. Extracted from Santos et al. (submitted to ApJ). Our set of stars consists of 14,441 M and K dwarfs based on the Kepler star properties catalog from Mathur et al. (2017), whose rotation periods have been studied by Santos e…

**Figure 2.** Left panel: classification result for a test set of 200 stars. The algorithm has been trained with 600 stars that were visually classified beforehand. CP stands for classical pulsator, norot for MS non-rotating star, Prot for MS rotating star, RG for red giant. The real class of an element corresponds to its column label; the class assigned by the classifier corresponds to the row label. Accuracy of t…

**Figure 3.** Left panel: filter-choice result for a test set of 2166 rotating MS stars. The algorithm was trained with 12,275 stars. The total number of stars is 14,441. Classifier accuracy is 0.927. The true accuracy on the period is 0.974. Right panel: relative importance of each parameter used for the classification. PGWPS, PACF, PCS, HACF, GACF, HCS and Sph values are considered for each filter (20, 55, 80) and con…

**Figure 4.** Example of a decision tree designed to classify two-parameter data x = {x1, x2} into two classes y1 and y2.
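The kind of tree described in the Figure 4 caption can be written out by hand. The sketch below is an illustration only: the split thresholds (0.5 and 0.3) and the tree shape are invented, since the figure itself is not reproduced here. It shows how a two-parameter point x = (x1, x2) is passed down threshold tests until it reaches a leaf labelled y1 or y2.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    label: str  # class assigned to any point that reaches this leaf


@dataclass
class Node:
    feature: int    # index into x: 0 for x1, 1 for x2
    threshold: float
    left: Union["Node", Leaf]   # followed when x[feature] <= threshold
    right: Union["Node", Leaf]  # followed when x[feature] > threshold


def predict(tree: Union[Node, Leaf], x: Sequence[float]) -> str:
    """Walk from the root to a leaf, applying one threshold test per node."""
    node = tree
    while isinstance(node, Node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical tree: split on x1 first, then on x2 for the high-x1 branch.
tree = Node(
    feature=0,
    threshold=0.5,
    left=Leaf("y1"),
    right=Node(feature=1, threshold=0.3, left=Leaf("y1"), right=Leaf("y2")),
)

labels = [predict(tree, point) for point in [(0.2, 0.9), (0.8, 0.1), (0.8, 0.7)]]
```

A random forest combines many such trees, each built from a resampled version of the training set, and takes the majority vote of their predictions.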

Original abstract

For a solar-like star, the surface rotation evolves with time, allowing in principle to estimate the age of a star from its surface rotation period. Here we are interested in measuring surface rotation periods of solar-like stars observed by the NASA Kepler mission. Different methods have been developed to track rotation signals in Kepler photometric light curves: time-frequency analysis based on wavelet techniques, autocorrelation and composite spectrum. We use the learning abilities of random forest classifiers to take decisions during two crucial steps of the analysis. First, given some input parameters, we discriminate the considered Kepler targets between rotating MS stars, non-rotating MS stars, red giants, binaries and pulsators. We then use a second classifier only on the MS rotating targets to decide the best data-analysis treatment.
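The first sentence of the abstract can be made concrete with a worked relation. The abstract does not say which rotation-age calibration the authors have in mind, so the following is only a sketch under an assumed Skumanich-type spin-down law (Skumanich 1972 appears in the reference list). The symbols $v_{\rm rot}$, $P_{\rm rot}$, $R$, $t$ and the solar normalisation are introduced here for illustration.

```latex
\begin{align}
  v_{\rm rot} &\propto t^{-1/2}
    && \text{(assumed Skumanich-type spin-down)} \\
  P_{\rm rot} &= \frac{2\pi R}{v_{\rm rot}} \propto t^{1/2}
    && \text{(at roughly constant radius } R\text{)} \\
  t &\approx t_\odot \left( \frac{P_{\rm rot}}{P_{\rm rot,\odot}} \right)^{2}
    && \text{(normalised to the Sun)}
\end{align}
```

Under this assumption, a star with half the solar rotation period would be about a quarter of the solar age. Practical calibrations also depend on stellar mass or colour, which this sketch ignores.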

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper outlines a two-stage random forest pipeline to classify Kepler targets and pick rotation analysis methods, but the abstract supplies zero validation metrics or performance numbers.

The paper applies random forests first to sort Kepler stars into rotating main-sequence, non-rotating main-sequence, red giants, binaries, and pulsators, then uses a second classifier on the rotating main-sequence subset to choose among wavelet, autocorrelation, or composite-spectrum methods for period measurement. This automation of the decision steps is the concrete addition to earlier manual or rule-based pipelines that already used those three techniques. The logic is clear and the goal of scaling to the full Kepler sample is reasonable. The description stays grounded in the existing literature on stellar rotation signals without overclaiming a fundamental advance. The main gap is the complete absence of any reported results: no accuracy figures, no cross-validation, no feature list, and no test against known labels or human classifications. Without those, the claim that the classifiers reliably identify categories and select the best method rests on an untested assumption about the training data. The paper is aimed at groups already working on large photometric rotation catalogs who might want an automated front end. A reader who needs a working, validated tool will not get it from this version. I would not send it for peer review until the validation section is added and the performance is shown.

Referee Report

1 major / 0 minor

Summary. The paper proposes using random forest classifiers in a two-stage pipeline to analyze Kepler photometric light curves of solar-like stars. The first classifier uses unspecified input parameters to categorize targets as rotating main-sequence (MS) stars, non-rotating MS stars, red giants, binaries, or pulsators. The second classifier, applied only to the rotating MS subset, selects the optimal data-analysis treatment (e.g., wavelet, autocorrelation, or composite spectrum) for measuring surface rotation periods.

Significance. If the classifiers can be shown to perform reliably, the approach could automate key decision steps in rotation-period extraction for large samples, improving consistency and scalability for gyrochronology studies. The manuscript does not yet supply the quantitative validation needed to establish this utility.

major comments (1)

[Abstract / Methods] Abstract and method description: No details are provided on training-set construction, label provenance, feature selection, cross-validation, accuracy/precision/recall metrics, confusion matrices, or feature-importance rankings for either random forest classifier. Without these, the central claim that the classifiers can reliably discriminate stellar categories and select the objectively best analysis method remains untested.
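For readers unfamiliar with the cross-validation the comment asks for, a k-fold split can be sketched as follows. This is a generic illustration and not a description of what the authors did: the function name `k_fold_indices`, the fold count and the seed are all assumptions. Each sample lands in exactly one test fold, and the model would be retrained on the remaining folds each time.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Ten hypothetical stars split into five folds of two.
splits = k_fold_indices(n_samples=10, k=5)
```

Averaging a score over the k test folds gives a less split-dependent estimate than a single train/test partition.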

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address the major comment below and agree that additional details on the classifiers are needed to strengthen the paper.

Referee: [Abstract / Methods] Abstract and method description: No details are provided on training-set construction, label provenance, feature selection, cross-validation, accuracy/precision/recall metrics, confusion matrices, or feature-importance rankings for either random forest classifier. Without these, the central claim that the classifiers can reliably discriminate stellar categories and select the objectively best analysis method remains untested.

Authors: We agree that the manuscript would benefit from expanded details on both random forest classifiers. In the revised version, we will add a dedicated subsection in the Methods describing: (i) training-set construction and label provenance (sourced from existing Kepler catalogs and visual verification), (ii) the full list of input features and selection criteria, (iii) the cross-validation procedure, and (iv) quantitative performance metrics including accuracy, precision, recall, confusion matrices, and feature-importance rankings. These additions will allow readers to evaluate the reliability of the two-stage pipeline. revision: yes
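The metrics listed in item (iv) all follow from a confusion matrix. The sketch below uses the orientation stated in the Figure 2 caption (columns are real classes, rows are assigned classes) and the class abbreviations from that caption. The six label pairs are made up for illustration and are not results from the paper.

```python
from typing import List, Sequence


def confusion_matrix(
    real: Sequence[str], assigned: Sequence[str], classes: Sequence[str]
) -> List[List[int]]:
    """Count label pairs; matrix[row][col] has row = assigned, col = real."""
    if len(real) != len(assigned):
        raise ValueError("label lists must have the same length")
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for r, a in zip(real, assigned):
        matrix[index[a]][index[r]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Share of all stars on the diagonal (assigned class equals real class)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def precision(matrix: List[List[int]], i: int) -> float:
    """Of the stars assigned to class i (row i), the share that really are class i."""
    assigned_total = sum(matrix[i])
    return matrix[i][i] / assigned_total if assigned_total else 0.0


def recall(matrix: List[List[int]], i: int) -> float:
    """Of the stars that really are class i (column i), the share recovered."""
    real_total = sum(row[i] for row in matrix)
    return matrix[i][i] / real_total if real_total else 0.0


classes = ["CP", "norot", "Prot", "RG"]
real = ["Prot", "Prot", "Prot", "norot", "RG", "CP"]          # made-up labels
assigned = ["Prot", "Prot", "norot", "norot", "RG", "CP"]     # made-up labels
cm = confusion_matrix(real, assigned, classes)
prot = classes.index("Prot")
```

In this toy case one rotating star is mislabelled as non-rotating, which lowers the recall for Prot and the precision for norot while leaving the precision for Prot untouched.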

Circularity Check

0 steps flagged

No circularity: standard supervised classification with no self-referential derivations or load-bearing self-citations.

The paper applies random forest classifiers to two classification tasks (stellar category discrimination and choice of analysis treatment) using input parameters and labels. No equations, derivations, or predictions are presented that reduce by construction to fitted inputs. No self-citations are invoked as uniqueness theorems or to smuggle ansatzes. The method is a standard ML pipeline whose validity depends on external training data quality rather than internal definitional loops. This matches the default expectation of no circularity for most papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that input parameters and training labels are reliable enough for accurate classification; no free parameters, invented entities, or additional axioms are specified in the abstract.

axioms (1)

Domain assumption: Input parameters supplied to the classifiers are sufficient to distinguish rotating MS stars, non-rotating MS stars, red giants, binaries, and pulsators.
Stated in the abstract as the basis for the first classifier; no details on parameter selection or label accuracy provided.

pith-pipeline@v0.9.0 · 5684 in / 1260 out tokens · 24969 ms · 2026-05-25T17:51:48.897153+00:00 · methodology

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 1 internal anchor

[3] Borucki, W. J., Koch, D., Basri, G., et al. 2010, Science, 327, 977

[4] Bugnet, L., García, R. A., Davies, G. R., et al. 2018, A&A, 620, A38

[5] Bugnet, L., García, R. A., Mathur, S., et al. 2019, A&A, 624, A79

[6] Ceillier, T., Tayar, J., Mathur, S., et al. 2017, A&A, 605, A111

[7] Ceillier, T., van Saders, J., García, R. A., et al. 2016, MNRAS, 456, 119

[8] García, R. A., Ballot, J., Mathur, S., Salabert, D., & Régulo, C. 2010, Signature of a magnetic activity cycle in HD49933 observed by CoRoT, arXiv e-prints, arXiv:1012.0494

[9] García, R. A., Ceillier, T., Salabert, D., et al. 2014, A&A, 572, A34

[10] García, R. A., Hekker, S., Stello, D., et al. 2011, MNRAS, 414, L6

[11] García, R. A., Mathur, S., Pires, S., et al. 2014, A&A, 568, A10

[12] Mathur, S., García, R. A., Ballot, J., et al. 2014, A&A, 562, A124

[13] Mathur, S., García, R. A., Régulo, C., et al. 2010, A&A, 511, A46

[14] Mathur, S., Huber, D., Batalha, N. M., et al. 2017, ApJS, 229, 30

[15] Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, Journal of Machine Learning Research, 12, 2825

[16] Santos, A. R. G., García, R. A., Mathur, S., et al., submitted to ApJ

[17] Skumanich, A. 1972, ApJ, 171, 565

[18] van Saders, J. L., Ceillier, T., Metcalfe, T. S., et al. 2016, Nature, 529, 181

[19] Bohr, N., Einstein, A., & Fermi, E. 1992, MNRAS, 301, 257

[20] Curie, M., & Curie, P. 1991, A&A, 248, 612

[21] de Gaulle, C. 1996, Solar Phys. (Oxford Univ. Press, Oxford)

[22] Einstein, A. 1926, ApJ, 63, 196 (Paper II)

[23] Kafka, F., Laurel, S., Hardy, O. et al. 1924, A&A, 248, 612

[24] Laurel, S., & Hardy, O. 1994, Active Driking, in The Evolution
