Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions

Gefei Lin; Jennifer Sacheck; Rui Miao; Xiaoke Zhang

arxiv: 2605.19208 · v1 · pith:4LNFAEEHnew · submitted 2026-05-19 · 📊 stat.AP · cs.LG· stat.ML

Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions

Gefei Lin , Rui Miao , Jennifer Sacheck , Xiaoke Zhang This is my paper

Pith reviewed 2026-05-20 02:59 UTC · model grok-4.3

classification 📊 stat.AP cs.LGstat.ML

keywords offline reinforcement learningphysical activitydaily stepspersonalized prescriptionAll of Uscardiometabolic biomarkersfunctional actions

0 comments

The pith

Offline reinforcement learning derives personalized daily step distributions from All of Us data that associate with improved cardiometabolic markers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a new offline reinforcement learning method to recommend optimal patterns of daily steps over time for better health biomarkers. It treats the sequence of daily steps as a functional action and learns policies from months of step counts paired with repeated biomarker measurements in the All of Us dataset. Simulation checks show the approach improves on standard continuous-action reinforcement learning techniques. When applied to the real data, the resulting policy calls for higher step totals and steadier activity day to day, with adjustments according to a person's blood glucose, body mass index, blood pressure, age, and sex.

Core claim

The authors introduce an offline reinforcement learning algorithm designed for functional actions, where each action is a full distribution of daily steps across a time window. Using large-scale observational records that link step counts to cardiometabolic biomarkers, the method learns a policy whose recommended activity patterns are associated with lower risk markers. The learned policy increases total daily steps and reduces day-to-day variability while providing subgroup-specific adjustments for blood glucose level, body mass index, blood pressure, age, and sex.

What carries the argument

Offline reinforcement learning algorithm that learns policies over functional actions representing daily step count distributions, trained on paired step and biomarker trajectories from the All of Us program.

If this is right

The optimal policy generally recommends higher daily step totals than observed in the data.
It favors a steadier, less variable pattern of activity across days.
Recommendations differ for subgroups defined by blood glucose level.
Further tailoring occurs according to body mass index, blood pressure, age, and sex.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same functional-action reinforcement learning setup could be applied to other continuous wearable signals such as heart rate or sleep duration.
Embedding the learned policy in a smartphone app would allow real-time updates as new step and biomarker data arrive.
Direct comparison of the derived policy against current public health step guidelines could quantify how much additional personalization improves outcomes.

Load-bearing premise

The observational All of Us step counts and biomarker records contain enough information for the algorithm to recover activity policies whose effects on cardiometabolic markers are not substantially distorted by unmeasured confounding or selection bias.

What would settle it

A randomized trial that assigns participants to follow the learned step-distribution policy versus usual care and then tracks changes in blood glucose, BMI, and blood pressure would directly test whether the recommended patterns produce the expected biomarker improvements.

Figures

Figures reproduced from arXiv: 2605.19208 by Gefei Lin, Jennifer Sacheck, Rui Miao, Xiaoke Zhang.

**Figure 1.** Figure 1: Data construction outline. Explicitly, the baseline time was operationally defined as the date of the first available glucose measurement, which is the only laboratory-based measurement among the cardiometabolic biomarkers in this analysis. For each participant, we extracted a 990-day observation window starting from baseline and divided it into consecutive 90-day intervals. 6 [PITH_FULL_IMAGE:figures/fu… view at source ↗

**Figure 2.** Figure 2: Daily steps of two subjects from the All of Us data. [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: Functional boxplot of learned and behavior PA distributions in the LQD domain. [PITH_FULL_IMAGE:figures/full_fig_p019_3.png] view at source ↗

**Figure 4.** Figure 4: Learned vs. behavioral quantile functions of 90-day average daily step counts. [PITH_FULL_IMAGE:figures/full_fig_p020_4.png] view at source ↗

**Figure 5.** Figure 5: Learned qˆ (red) vs behavior qˆ b (black) within Normal, Borderline, High, and Low glucose subgroups respectively. Solid red and black curves are averaged qˆ and qˆ b respectively [PITH_FULL_IMAGE:figures/full_fig_p022_5.png] view at source ↗

**Figure 6.** Figure 6: Same as Figure [PITH_FULL_IMAGE:figures/full_fig_p024_6.png] view at source ↗

**Figure 7.** Figure 7: Same as Figure [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗

**Figure 8.** Figure 8: Same as Figure [PITH_FULL_IMAGE:figures/full_fig_p025_8.png] view at source ↗

**Figure 9.** Figure 9: Same as Figure [PITH_FULL_IMAGE:figures/full_fig_p026_9.png] view at source ↗

read the original abstract

Physical activity (PA) plays an important role in maintaining and improving health. Daily steps have been a key PA measure that is easily accessible with common wearable devices. However, methods are lacking to recommend a personalized optimal distribution of daily steps over a period of time for the best of certain health biomarkers. In this paper, we fill this void based on the data from the All of Us Research Program which includes months of step counts as well as repeated measurements of key health biomarkers. We develop a new offline reinforcement learning (RL) algorithm to learn personalized and optimal PA distributions associated with cardiometabolic risk, where the action is a function representing the daily step distribution over a period of time. Simulation studies demonstrate the advantage of the proposed approach over existing continuous-action RL methods. The learned optimal policy from the All of Us data generally suggests people take more daily steps and also follow a more consistent pattern of PA over time while offering tailored recommendations for subgroups in blood glucose level, body mass index, blood pressure, age, and sex.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper introduces an offline RL method for functional actions (step distributions over time) and fits it to All of Us data for subgroup PA recommendations, but the observational setup leaves causal claims vulnerable to unmeasured confounding.

read the letter

The main thing here is a new offline reinforcement learning algorithm built for functional actions, where the policy recommends a full distribution of daily steps across a period rather than a single average. They apply it to the All of Us cohort with its repeated step counts and biomarker measures, and the learned policies point toward more steps overall plus more consistent timing, with tailoring by glucose, BMI, blood pressure, age, and sex. Simulations apparently show gains over standard continuous-action RL baselines, which is plausible if the shape of the activity pattern matters for the outcomes. That functional-action framing is the clearest technical step forward and gives the work a distinct angle inside applied RL for health data. The choice of All of Us is reasonable for this kind of repeated-measures problem, and the subgroup breakdowns add some practical flavor that readers in digital preventive medicine might appreciate. The soft spot is the move from observational records to policy recommendations that are meant to improve cardiometabolic markers. The abstract and stress-test note both flag the absence of explicit handling for unmeasured confounding, time-varying selection, or sensitivity checks, and nothing in the provided summary indicates those safeguards are present in the full text. Without them the tailored suggestions could easily capture associations driven by factors like socioeconomic status or health motivation rather than effects that would appear under intervention. The real-data results are described qualitatively with no error bars or robustness numbers, which keeps the support thin. This paper is aimed at researchers working on RL methods for wearable or digital-health interventions. A reader interested in extending continuous-action RL to functional outputs or in seeing All of Us used for policy learning would get concrete ideas to adapt. It is coherent enough on its own terms and addresses a relevant applied question to deserve a serious referee, even though the causal identification section will need strengthening. I would send it out for peer review rather than desk reject.

Referee Report

3 major / 2 minor

Summary. The paper develops a new offline reinforcement learning algorithm that treats the daily step count distribution over a multi-day period as a functional action. It applies this method to observational data from the All of Us Research Program to learn personalized policies that are claimed to optimize cardiometabolic biomarkers (blood glucose, BMI, blood pressure). Simulation studies are reported to show advantages over existing continuous-action RL baselines, and the real-data analysis concludes that the learned policy recommends higher and more consistent step counts with subgroup-specific tailoring by age, sex, and biomarker levels.

Significance. If the offline RL procedure can be shown to recover policies whose value functions reflect causal effects rather than spurious associations, the work would offer a practical framework for precision physical-activity prescriptions that leverage readily available wearable data. The functional-action formulation is a distinctive technical contribution that could be adopted in other longitudinal health-behavior settings.

major comments (3)

[§4.2] §4.2 (Offline RL algorithm and value-function estimation): the procedure learns the policy directly from the observational All of Us trajectories without any described adjustment for unmeasured confounding, time-varying covariates, or selection bias. Because the headline claim is that the resulting policy improves cardiometabolic markers, the absence of confounding control (e.g., via negative controls, instrumental variables, or explicit sensitivity analysis) is load-bearing for the causal interpretation of the subgroup recommendations.
[§5.2] §5.2 (Real-data results): no quantitative performance metrics, confidence intervals, or sensitivity checks are supplied for the All of Us policy; the text only states that the policy “generally suggests” more steps and consistency. This makes it impossible to judge the magnitude or robustness of the claimed biomarker improvements that underpin the personalized recommendations.
[§3] §3 (Simulation studies): while the abstract asserts an advantage over existing continuous-action RL methods, the specific evaluation metrics (regret, value-function difference, or biomarker improvement) and their variability across replications are not reported in sufficient detail to verify that the proposed functional-action approach is meaningfully superior under realistic confounding structures.

minor comments (2)

[§2.1] The notation for the functional action space (step-count distribution over a horizon) is introduced without an explicit mathematical definition or illustrative plot; adding a short equation and example figure would improve readability.
[§4.1] Several biomarker trajectories are described as “repeated measurements” but the exact number of observations per participant and the handling of missingness are not stated; a brief table summarizing the data structure would help.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us identify areas for improvement in the manuscript. We address each major comment below and indicate the revisions we plan to make.

read point-by-point responses

Referee: [§4.2] §4.2 (Offline RL algorithm and value-function estimation): the procedure learns the policy directly from the observational All of Us trajectories without any described adjustment for unmeasured confounding, time-varying covariates, or selection bias. Because the headline claim is that the resulting policy improves cardiometabolic markers, the absence of confounding control (e.g., via negative controls, instrumental variables, or explicit sensitivity analysis) is load-bearing for the causal interpretation of the subgroup recommendations.

Authors: We agree that the manuscript would benefit from a clearer discussion of the assumptions underlying the causal interpretation of the learned policies. Our approach is based on offline RL applied to observational data, which implicitly relies on the no unmeasured confounding assumption for causal claims. In the revised manuscript, we will expand the methods section to explicitly state these assumptions and add a sensitivity analysis subsection. This will include bounding the policy value under different levels of unmeasured confounding and discussing potential time-varying confounders in the All of Us data context. We believe this will strengthen the interpretation without overclaiming causality. revision: yes
Referee: [§5.2] §5.2 (Real-data results): no quantitative performance metrics, confidence intervals, or sensitivity checks are supplied for the All of Us policy; the text only states that the policy “generally suggests” more steps and consistency. This makes it impossible to judge the magnitude or robustness of the claimed biomarker improvements that underpin the personalized recommendations.

Authors: We acknowledge the lack of quantitative details in the real-data analysis. To address this, we will revise §5.2 to include specific quantitative results, such as the average increase in recommended daily steps (with standard deviations), consistency measures (e.g., variance of daily steps), and estimated improvements in biomarker values under the learned policy. We will also add bootstrap confidence intervals for these estimates and perform sensitivity checks by varying the number of days in the functional action and the RL hyperparameters. These additions will allow for a better assessment of the magnitude and robustness of the findings. revision: yes
Referee: [§3] §3 (Simulation studies): while the abstract asserts an advantage over existing continuous-action RL methods, the specific evaluation metrics (regret, value-function difference, or biomarker improvement) and their variability across replications are not reported in sufficient detail to verify that the proposed functional-action approach is meaningfully superior under realistic confounding structures.

Authors: Thank you for this observation. The simulation studies in §3 compare our functional-action RL method against continuous-action baselines using metrics such as average regret and value function estimates. However, we recognize that more detailed reporting is needed. In the revision, we will include comprehensive tables showing the mean and standard error of these metrics across 100 simulation replications. We will also add experiments under simulated confounding structures to demonstrate performance under realistic conditions, thereby verifying the advantages more rigorously. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is data-driven RL on observational inputs

full rationale

The paper develops an offline RL algorithm whose output (optimal policy for step distributions) is computed directly from the All of Us observational records via the proposed method. No equation or procedure reduces the target result to a quantity defined in terms of itself, nor renames a fitted parameter as a prediction. The abstract and described approach contain no self-citation load-bearing steps, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation. The central claim remains an empirical learning result whose validity hinges on external assumptions about confounding rather than internal definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated. The method implicitly relies on standard RL assumptions (Markov property, sufficient state representation) and on the unconfoundedness of the observational data, but these cannot be audited in detail.

pith-pipeline@v0.9.0 · 5711 in / 1277 out tokens · 73575 ms · 2026-05-20T02:59:47.177069+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We develop a new offline reinforcement learning (RL) algorithm to learn personalized and optimal PA distributions... using penalized splines... LQD transformation
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Algorithm 2... policy update... maximizing averaged Q with roughness penalty on second derivative

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages

[1]

Agarwal, N

A. Agarwal, N. Jiang, S. M. Kakade, and W. Sun. Reinforcement learning: Theory and algorithms. Technical report, University of Washington, Seattle, WA, 2019

work page 2019
[2]

Aguiar, N

R. Aguiar, N. Mofid, and H. A. Nam. Exploring optimal control with observations at a cost. arXiv:2006.15757, 2020

work page arXiv 2006
[3]

M. N. Ahmadi, L. F. M. Rezende, G. Ferrari, B. del Pozo Cruz, I.-M. Lee, and E. Sta- matakis. Do the associations of daily steps with mortality and incident cardiovascular disease differ by sedentary time levels? a device-based cohort study.British Journal of Sports Medicine, 58(5):261–268, 2024. 29

work page 2024
[4]

Resources for using Fitbit data

All of Us Research Program. Resources for using Fitbit data. User Support article, 2025

work page 2025
[5]

All of Us

All of Us Research Program Investigators. The “All of Us” research program.The New England Journal of Medicine, 381(7):668–676, 2019

work page 2019
[6]

Antos, C

A. Antos, C. Szepesvári, and R. Munos. Fitted Q-iteration in continuous action-space MDPs. InAdvances in Neural Information Processing Systems 20, pages 9–16, 2007

work page 2007
[7]

Chang, L.-F

Y.-K. Chang, L.-F. Huang, S.-J. Shin, K.-D. Lin, K. Chong, F.-S. Yen, H.-Y. Chang, S.-Y. Chuang, T.-J. Hsieh, C. A. Hsiung, and C.-C. Hsu. A point-based mortality prediction system for older adults with diabetes.Scientific Reports, 7(1):12652, 2017

work page 2017
[8]

Cleven, J

L. Cleven, J. Krell-Roesch, C. R. Nigg, and A. Woll. The association between physical activity with incident obesity, coronary heart disease, diabetes and hypertension in adults: a systematic review of longitudinal studies published after 2012.BMC Public Health, 20:726, 2020

work page 2012
[9]

Physicalactivity and cognitive function in middle-aged and older adults: An analysis of 104,909 people from 20 countries.Mayo Clinic Proceedings, 91(11):1515–1524, 2016

P.deSoutoBarreto, J.Delrieu, S.Andrieu, B.Vellas, andY.Rolland. Physicalactivity and cognitive function in middle-aged and older adults: An analysis of 104,909 people from 20 countries.Mayo Clinic Proceedings, 91(11):1515–1524, 2016

work page 2016
[10]

del Pozo Cruz, M

B. del Pozo Cruz, M. N. Ahmadi, S. L. Naismith, and E. Stamatakis. Association of daily step count and intensity with incident dementia in 78,430 adults living in the UK.JAMA Neurology, 79(10):1059–1063, 2022

work page 2022
[11]

del Pozo Cruz, S

B. del Pozo Cruz, S. J. H. Biddle, P. A. Gardiner, and D. Ding. Light-intensity physical activity and life expectancy: National health and nutrition survey.American Journal of Preventive Medicine, 61(3):428–433, 2021. 30

work page 2021
[12]

Delaigle and P

A. Delaigle and P. Hall. Defining probability density for a distribution of random functions.The Annals of Statistics, 38(2):1171–1193, 2010

work page 2010
[13]

J. M. Desman, Z.-W. Hong, M. Sabounchi, A. S. Sawant, J. Gill, A. C. Costa, G. Ku- mar, R. Sharma, A. Gupta, P. McCarthy, V. Nandwani, D. Powell, A. Carideo, D. Goodwin, S. Ahmed, U. Gidwani, M. A. Levin, R. Varghese, F. Filsoufi, R. Free- man, A. Shetreat-Klein, A. W. Charney, I. Hofer, L. Chan, D. Reich, P. Kovatch, R. Kohli-Seth, M. Kraft, P. Agrawal, ...

work page 2025
[14]

Ernst, P

D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503–556, 2005

work page 2005
[15]

S. L. Fleming, K. Jeyapragasan, T. Duan, D. Ding, S. Gombar, N. Shah, and E. Brunskill. Missingness as stability: Understanding the structure of missingness in longitudinal EHR data and its impact on reinforcement learning in healthcare. arXiv:1911.07084, 2019

work page arXiv 1911
[16]

J. L. Gay, D. M. Buchner, and M. D. Schmidt. Dose-response association of physical activity with hba1c: Intensity and bout length.Preventive Medicine, 86:58–63, 2016

work page 2016
[17]

Ghosal, S

R. Ghosal, S. K. Ghosh, J. A. Schrack, and V. Zipunnikov. Distributional outcome regression via quantile functions and its application to modelling continuously moni- tored heart rate and physical activity.Journal of the American Statistical Association, 120(551):1347–1359, 2025

work page 2025
[18]

Jayedi, S

A. Jayedi, S. Soltani, A. Emadi, M.-S. Zargar, and A. Najafi. Aerobic exercise and 31 weight loss in adults: A systematic review and dose-response meta-analysis.JAMA Network Open, 7(12):e2452185, 2024

work page 2024
[19]

Jeong, A

H. Jeong, A. R. Roghanizad, H. Master, J. Kim, A. Kouame, P. A. Harris, M. Basford, K. Marginean, and J. Dunn. Data from the All of Us research program reinforces existence of activity inequality.npj Digital Medicine, 8(1):8, 2025

work page 2025
[20]

G. S. Kimeldorf and G. Wahba. A correspondence between Bayesian estimation on stochastic processes and smoothing by splines.The Annals of Mathematical Statistics, 41(2):495–502, 1970

work page 1970
[21]

W. E. Kraus, K. F. Janz, K. E. Powell, W. W. Campbell, J. M. Jakicic, R. P. Troiano, K. Sprow, A. Torres, K. L. Piercy, and 2018 Physical Activity Guidelines Advisory Committee. Daily step counts for measuring physical activity exposure and its relation to health.Medicine & Science in Sports & Exercise, 51(6):1206–1212, 2019

work page 2018
[22]

Lattimore, M

T. Lattimore, M. Hutter, and P. Sunehag. The sample-complexity of general rein- forcement learning. InProceedings of the 30th International Conference on Machine Learning, PMLR 28, pages 28–36, 2013

work page 2013
[23]

H. M. Le, C. Voloshin, and Y. Yue. Batch policy learning under constraints. In Proceedings of the 36th International Conference on Machine Learning, PMLR 97, pages 3703–3712, 2019

work page 2019
[24]

P. Liao, K. Greenewald, P. Klasnja, and S. Murphy. Personalized heartsteps: A reinforcement learning algorithm for optimizing physical activity.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(1):18, 2020

work page 2020
[25]

Long and X

Z. Long and X. Zhang. Learning causal effect of physical activity distribution: An 32 application of functional treatment effect estimation with unmeasured confounding. Journal of Applied Statistics, 52(14):2759–2776, 2025

work page 2025
[26]

Matabuena and A

M. Matabuena and A. Petersen. Distributional data analysis of accelerometer data from the NHANES database using nonparametric survey regression models.Journal of the Royal Statistical Society: Series C (Applied Statistics), 72(2):294–313, 2023

work page 2023
[27]

Neumann and J

G. Neumann and J. Peters. Fitted Q-iteration by advantage weighted regression. In Advances in Neural Information Processing Systems 21, pages 1177–1184, 2008

work page 2008
[28]

A. E. Paluch, S. Bajpai, D. R. Bassett, M. R. Carnethon, U. Ekelund, K. R. Evenson, D. A. Galuska, et al. Daily steps and all-cause mortality: A meta-analysis of 15 international cohorts.The Lancet Public Health, 7(3):e219–e228, 2022

work page 2022
[29]

Pan, A.-M

Y. Pan, A.-M. Farahmand, M. White, S. Nabi, P. Grover, and D. Nikovski. Rein- forcement learning with function-valued action spaces for partial differential equation control. InProceedings of the 35th International Conference on Machine Learning, PMLR 80, pages 3986–3995, 2018

work page 2018
[30]

Petersen and H.-G

A. Petersen and H.-G. Müller. Functional data analysis for density functions by trans- formation to a Hilbert space.The Annals of Statistics, 44(1):183–218, 2016

work page 2016
[31]

Pini and S

A. Pini and S. Vantini. Interval-wise testing for functional data.Journal of Nonpara- metric Statistics, 29(2):407–424, 2017

work page 2017
[32]

Z. P. Rostron, R. A. Green, M. Kingsley, and A. Zacharias. Associations between measures of physical activity and muscle size and strength: A systematic review. Archives of Rehabilitation Research and Clinical Translation, 3(2):100124, 2021. 33

work page 2021
[33]

Strain, S

T. Strain, S. Flaxman, R. Guthold, E. Semenova, M. Cowan, L. M. Riley, F. C. Bull, G. A. Stevens, and Country Data Author Group. National, regional, and global trends in insufficient physical activity among adults from 2000 to 2022: A pooled analysis of 507 population-based surveys with 5.7 million participants.The Lancet Global Health, 12(8):e1232–e1243, 2024

work page 2000
[34]

Sun and M

Y. Sun and M. G. Genton. Functional boxplots.Journal of Computational and Graph- ical Statistics, 20(2):316–334, 2011

work page 2011
[35]

Tudor-Locke and D

C. Tudor-Locke and D. R. Bassett. How many steps/day are enough? preliminary pedometer indices for public health.Sports Medicine, 34(1):1–8, 2004

work page 2004
[36]

Tudor-Locke, C

C. Tudor-Locke, C. L. Craig, W. J. Brown, S. A. Clemes, K. De Cocker, B. Giles-Corti, Y. Hatano, S. Inoue, S. M. Matsudo, N. Mutrie, J.-M. Oppert, D. A. Rowe, M. D. Schmidt, G. M. Schofield, J. C. Spence, P. J. Teixeira, M. A. Tully, and S. N. Blair. How many steps/day are enough? for adults.International Journal of Behavioral Nutrition and Physical Activ...

work page 2011
[37]

how many steps are enough?

C. Tudor-Locke, Y. Hatano, R. P. Pangrazi, and M. Kang. Revisiting “how many steps are enough?”.Medicine & Science in Sports & Exercise, 40(7 Suppl):S537–S543, 2008

work page 2008
[38]

A review of causal estimation of effects in mediation analyses

M. Uehara, C. Shi, and N. Kallus. A review of off-policy evaluation in reinforcement learning. arXiv:2212.06355, 2022

work page arXiv 2022
[39]

Department of Health and Human Services.Physical Activity Guidelines for Americans

U.S. Department of Health and Human Services.Physical Activity Guidelines for Americans. U.S. Department of Health and Human Services, Washington, DC, 2 edition, 2018. 34

work page 2018
[40]

J. Wang, R. K. W. Wong, X. Zhang, and K. C. G. Chan. Flexible functional treatment effect estimation.Journal of Machine Learning Research, 27(16):1–48, 2026

work page 2026
[41]

World Health Organization, Geneva, 2020

World Health Organization.WHO Guidelines on Physical Activity and Sedentary Behaviour. World Health Organization, Geneva, 2020

work page 2020
[42]

Physical activity

World Health Organization. Physical activity. Fact sheet, 2024. 35

work page 2024

[1] [1]

Agarwal, N

A. Agarwal, N. Jiang, S. M. Kakade, and W. Sun. Reinforcement learning: Theory and algorithms. Technical report, University of Washington, Seattle, WA, 2019

work page 2019

[2] [2]

Aguiar, N

R. Aguiar, N. Mofid, and H. A. Nam. Exploring optimal control with observations at a cost. arXiv:2006.15757, 2020

work page arXiv 2006

[3] [3]

M. N. Ahmadi, L. F. M. Rezende, G. Ferrari, B. del Pozo Cruz, I.-M. Lee, and E. Sta- matakis. Do the associations of daily steps with mortality and incident cardiovascular disease differ by sedentary time levels? a device-based cohort study.British Journal of Sports Medicine, 58(5):261–268, 2024. 29

work page 2024

[4] [4]

Resources for using Fitbit data

All of Us Research Program. Resources for using Fitbit data. User Support article, 2025

work page 2025

[5] [5]

All of Us

All of Us Research Program Investigators. The “All of Us” research program.The New England Journal of Medicine, 381(7):668–676, 2019

work page 2019

[6] [6]

Antos, C

A. Antos, C. Szepesvári, and R. Munos. Fitted Q-iteration in continuous action-space MDPs. InAdvances in Neural Information Processing Systems 20, pages 9–16, 2007

work page 2007

[7] [7]

Chang, L.-F

Y.-K. Chang, L.-F. Huang, S.-J. Shin, K.-D. Lin, K. Chong, F.-S. Yen, H.-Y. Chang, S.-Y. Chuang, T.-J. Hsieh, C. A. Hsiung, and C.-C. Hsu. A point-based mortality prediction system for older adults with diabetes.Scientific Reports, 7(1):12652, 2017

work page 2017

[8] [8]

Cleven, J

L. Cleven, J. Krell-Roesch, C. R. Nigg, and A. Woll. The association between physical activity with incident obesity, coronary heart disease, diabetes and hypertension in adults: a systematic review of longitudinal studies published after 2012.BMC Public Health, 20:726, 2020

work page 2012

[9] [9]

Physicalactivity and cognitive function in middle-aged and older adults: An analysis of 104,909 people from 20 countries.Mayo Clinic Proceedings, 91(11):1515–1524, 2016

P.deSoutoBarreto, J.Delrieu, S.Andrieu, B.Vellas, andY.Rolland. Physicalactivity and cognitive function in middle-aged and older adults: An analysis of 104,909 people from 20 countries.Mayo Clinic Proceedings, 91(11):1515–1524, 2016

work page 2016

[10] [10]

del Pozo Cruz, M

B. del Pozo Cruz, M. N. Ahmadi, S. L. Naismith, and E. Stamatakis. Association of daily step count and intensity with incident dementia in 78,430 adults living in the UK.JAMA Neurology, 79(10):1059–1063, 2022

work page 2022

[11] [11]

del Pozo Cruz, S

B. del Pozo Cruz, S. J. H. Biddle, P. A. Gardiner, and D. Ding. Light-intensity physical activity and life expectancy: National health and nutrition survey.American Journal of Preventive Medicine, 61(3):428–433, 2021. 30

work page 2021

[12] [12]

Delaigle and P

A. Delaigle and P. Hall. Defining probability density for a distribution of random functions.The Annals of Statistics, 38(2):1171–1193, 2010

work page 2010

[13] [13]

J. M. Desman, Z.-W. Hong, M. Sabounchi, A. S. Sawant, J. Gill, A. C. Costa, G. Ku- mar, R. Sharma, A. Gupta, P. McCarthy, V. Nandwani, D. Powell, A. Carideo, D. Goodwin, S. Ahmed, U. Gidwani, M. A. Levin, R. Varghese, F. Filsoufi, R. Free- man, A. Shetreat-Klein, A. W. Charney, I. Hofer, L. Chan, D. Reich, P. Kovatch, R. Kohli-Seth, M. Kraft, P. Agrawal, ...

work page 2025

[14] [14]

Ernst, P

D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503–556, 2005

work page 2005

[15] [15]

S. L. Fleming, K. Jeyapragasan, T. Duan, D. Ding, S. Gombar, N. Shah, and E. Brunskill. Missingness as stability: Understanding the structure of missingness in longitudinal EHR data and its impact on reinforcement learning in healthcare. arXiv:1911.07084, 2019

work page arXiv 1911

[16] [16]

J. L. Gay, D. M. Buchner, and M. D. Schmidt. Dose-response association of physical activity with hba1c: Intensity and bout length.Preventive Medicine, 86:58–63, 2016

work page 2016

[17] [17]

Ghosal, S

R. Ghosal, S. K. Ghosh, J. A. Schrack, and V. Zipunnikov. Distributional outcome regression via quantile functions and its application to modelling continuously moni- tored heart rate and physical activity.Journal of the American Statistical Association, 120(551):1347–1359, 2025

work page 2025

[18] [18]

Jayedi, S

A. Jayedi, S. Soltani, A. Emadi, M.-S. Zargar, and A. Najafi. Aerobic exercise and 31 weight loss in adults: A systematic review and dose-response meta-analysis.JAMA Network Open, 7(12):e2452185, 2024

work page 2024

[19] [19]

Jeong, A

H. Jeong, A. R. Roghanizad, H. Master, J. Kim, A. Kouame, P. A. Harris, M. Basford, K. Marginean, and J. Dunn. Data from the All of Us research program reinforces existence of activity inequality.npj Digital Medicine, 8(1):8, 2025

work page 2025

[20] [20]

G. S. Kimeldorf and G. Wahba. A correspondence between Bayesian estimation on stochastic processes and smoothing by splines.The Annals of Mathematical Statistics, 41(2):495–502, 1970

work page 1970

[21] [21]

W. E. Kraus, K. F. Janz, K. E. Powell, W. W. Campbell, J. M. Jakicic, R. P. Troiano, K. Sprow, A. Torres, K. L. Piercy, and 2018 Physical Activity Guidelines Advisory Committee. Daily step counts for measuring physical activity exposure and its relation to health.Medicine & Science in Sports & Exercise, 51(6):1206–1212, 2019

work page 2018

[22] [22]

Lattimore, M

T. Lattimore, M. Hutter, and P. Sunehag. The sample-complexity of general rein- forcement learning. InProceedings of the 30th International Conference on Machine Learning, PMLR 28, pages 28–36, 2013

work page 2013

[23] [23]

H. M. Le, C. Voloshin, and Y. Yue. Batch policy learning under constraints. In Proceedings of the 36th International Conference on Machine Learning, PMLR 97, pages 3703–3712, 2019

work page 2019

[24] [24]

P. Liao, K. Greenewald, P. Klasnja, and S. Murphy. Personalized heartsteps: A reinforcement learning algorithm for optimizing physical activity.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(1):18, 2020

work page 2020

[25] [25]

Long and X

Z. Long and X. Zhang. Learning causal effect of physical activity distribution: An 32 application of functional treatment effect estimation with unmeasured confounding. Journal of Applied Statistics, 52(14):2759–2776, 2025

work page 2025

[26] [26]

Matabuena and A

M. Matabuena and A. Petersen. Distributional data analysis of accelerometer data from the NHANES database using nonparametric survey regression models.Journal of the Royal Statistical Society: Series C (Applied Statistics), 72(2):294–313, 2023

work page 2023

[27] [27]

Neumann and J

G. Neumann and J. Peters. Fitted Q-iteration by advantage weighted regression. In Advances in Neural Information Processing Systems 21, pages 1177–1184, 2008

work page 2008

[28] [28]

A. E. Paluch, S. Bajpai, D. R. Bassett, M. R. Carnethon, U. Ekelund, K. R. Evenson, D. A. Galuska, et al. Daily steps and all-cause mortality: A meta-analysis of 15 international cohorts.The Lancet Public Health, 7(3):e219–e228, 2022

work page 2022

[29] [29]

Pan, A.-M

Y. Pan, A.-M. Farahmand, M. White, S. Nabi, P. Grover, and D. Nikovski. Rein- forcement learning with function-valued action spaces for partial differential equation control. InProceedings of the 35th International Conference on Machine Learning, PMLR 80, pages 3986–3995, 2018

work page 2018

[30] [30]

Petersen and H.-G

A. Petersen and H.-G. Müller. Functional data analysis for density functions by trans- formation to a Hilbert space.The Annals of Statistics, 44(1):183–218, 2016

work page 2016

[31] [31]

Pini and S

A. Pini and S. Vantini. Interval-wise testing for functional data.Journal of Nonpara- metric Statistics, 29(2):407–424, 2017

work page 2017

[32] [32]

Z. P. Rostron, R. A. Green, M. Kingsley, and A. Zacharias. Associations between measures of physical activity and muscle size and strength: A systematic review. Archives of Rehabilitation Research and Clinical Translation, 3(2):100124, 2021. 33

work page 2021

[33] [33]

Strain, S

T. Strain, S. Flaxman, R. Guthold, E. Semenova, M. Cowan, L. M. Riley, F. C. Bull, G. A. Stevens, and Country Data Author Group. National, regional, and global trends in insufficient physical activity among adults from 2000 to 2022: A pooled analysis of 507 population-based surveys with 5.7 million participants.The Lancet Global Health, 12(8):e1232–e1243, 2024

work page 2000

[34] [34]

Sun and M

Y. Sun and M. G. Genton. Functional boxplots.Journal of Computational and Graph- ical Statistics, 20(2):316–334, 2011

work page 2011

[35] [35]

Tudor-Locke and D

C. Tudor-Locke and D. R. Bassett. How many steps/day are enough? preliminary pedometer indices for public health.Sports Medicine, 34(1):1–8, 2004

work page 2004

[36] [36]

Tudor-Locke, C

C. Tudor-Locke, C. L. Craig, W. J. Brown, S. A. Clemes, K. De Cocker, B. Giles-Corti, Y. Hatano, S. Inoue, S. M. Matsudo, N. Mutrie, J.-M. Oppert, D. A. Rowe, M. D. Schmidt, G. M. Schofield, J. C. Spence, P. J. Teixeira, M. A. Tully, and S. N. Blair. How many steps/day are enough? for adults.International Journal of Behavioral Nutrition and Physical Activ...

work page 2011

[37] [37]

how many steps are enough?

C. Tudor-Locke, Y. Hatano, R. P. Pangrazi, and M. Kang. Revisiting “how many steps are enough?”.Medicine & Science in Sports & Exercise, 40(7 Suppl):S537–S543, 2008

work page 2008

[38] [38]

A review of causal estimation of effects in mediation analyses

M. Uehara, C. Shi, and N. Kallus. A review of off-policy evaluation in reinforcement learning. arXiv:2212.06355, 2022

work page arXiv 2022

[39] [39]

Department of Health and Human Services.Physical Activity Guidelines for Americans

U.S. Department of Health and Human Services.Physical Activity Guidelines for Americans. U.S. Department of Health and Human Services, Washington, DC, 2 edition, 2018. 34

work page 2018

[40] [40]

J. Wang, R. K. W. Wong, X. Zhang, and K. C. G. Chan. Flexible functional treatment effect estimation.Journal of Machine Learning Research, 27(16):1–48, 2026

work page 2026

[41] [41]

World Health Organization, Geneva, 2020

World Health Organization.WHO Guidelines on Physical Activity and Sedentary Behaviour. World Health Organization, Geneva, 2020

work page 2020

[42] [42]

Physical activity

World Health Organization. Physical activity. Fact sheet, 2024. 35

work page 2024