
arxiv: 2605.21295 · v1 · pith:OCQWWXCUnew · submitted 2026-05-20 · 💻 cs.LG · cs.AI · cs.HC

TimeSRL: Generalizable Time-Series Behavioral Modeling via Semantic RL-Tuned LLMs -- A Case Study in Mental Health

Pith reviewed 2026-05-21 06:06 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.HC
keywords time-series modeling · large language models · reinforcement learning · mental health prediction · semantic abstractions · generalization · passive sensing · cross-dataset transfer

The pith

TimeSRL routes time-series signals through language abstractions and RL tuning to generalize mental health predictions across datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TimeSRL as a two-stage process where an LLM first turns raw passive sensing data into high-level natural language descriptions of behavior. A second stage then predicts anxiety or depression scores from those descriptions alone. Training uses Group Relative Policy Optimization with reinforcement learning from verifiable rewards to shape the abstractions without needing labeled intermediate steps. On benchmarks that hold out entire datasets for testing, this yields lower mean absolute errors than both traditional machine learning and other LLM baselines. The results show the abstractions transfer to new sensing setups without additional fine-tuning on the target data.

Core claim

TimeSRL is a two-stage LLM framework that abstracts raw signals into high-level natural language then predicts behavioral outcomes from these abstractions alone, optimized end-to-end using Group Relative Policy Optimization with Reinforcement Learning from Verifiable Rewards, achieving state-of-the-art performance on cross-cohort generalization benchmarks for mental health prediction.
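The group-relative part of GRPO can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the reward function `verifiable_reward` (negative absolute error against the true score) and all numbers are assumptions made for the example, since this page does not state the exact reward formulation.

```python
from statistics import mean, pstdev


def verifiable_reward(predicted_score: float, true_score: float) -> float:
    """Hypothetical outcome-level reward: negative absolute error."""
    return -abs(predicted_score - true_score)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of rollouts for the same prompt.

    Each rollout's advantage is its reward minus the group mean,
    divided by the group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, four sampled (summary, prediction) rollouts; values are made up.
true_score = 3.0
predictions = [2.0, 3.0, 5.0, 4.0]
rewards = [verifiable_reward(p, true_score) for p in predictions]
advantages = group_relative_advantages(rewards)

for p, r, a in zip(predictions, rewards, advantages):
    print(f"prediction={p:.1f} reward={r:+.1f} advantage={a:+.3f}")
```

Rollouts whose prediction lands closer to the true score than the group average get a positive advantage, so no labeled intermediate summary is needed to decide which abstraction to reinforce.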

What carries the argument

The semantic bottleneck that converts raw time-series into natural language abstractions before prediction, aligned end-to-end via RLVR to produce outcome-relevant descriptions.
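A rough sketch of the bottleneck follows, assuming the LLM is any text-in, text-out callable. The prompt wording, the `recording_llm` stub, and the feature names are hypothetical; they only illustrate that the second stage receives the summary and never the raw numbers.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type alias: any function mapping a prompt to generated text.
LLM = Callable[[str], str]


def build_stage1_prompt(days: Sequence[Dict[str, float]]) -> str:
    """Lay out the raw daily features as text for the abstraction stage."""
    lines = [
        f"Day {i + 1}: " + ", ".join(f"{k}={v}" for k, v in day.items())
        for i, day in enumerate(days)
    ]
    return "Summarize the behavioral patterns in this data:\n" + "\n".join(lines)


def two_stage_predict(llm: LLM, days: Sequence[Dict[str, float]]) -> Tuple[str, float]:
    """Stage 1 writes a summary; Stage 2 predicts from the summary alone."""
    summary = llm(build_stage1_prompt(days))
    # The bottleneck: the Stage 2 prompt contains only the generated summary.
    stage2_prompt = "Given this behavioral summary, output a score:\n" + summary
    return summary, float(llm(stage2_prompt))


# Stub standing in for a real model so the sketch runs offline.
seen_prompts: List[str] = []


def recording_llm(prompt: str) -> str:
    seen_prompts.append(prompt)
    if prompt.startswith("Summarize"):
        return "Sleep was regular; activity dipped on two days."
    return "3"


summary, score = two_stage_predict(
    recording_llm, [{"sleep_hours": 7.5, "steps": 4200}]
)
print(summary, score)
```

In the paper's setup both stages use the same tuned model (see the Figure 2 caption); the stub here plays both roles.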

If this is right

  • The same abstractions support accurate prediction on unseen sensing pipelines without any target-domain fine-tuning.
  • Cross-benchmark transfer performance approaches the level of within-domain training for both anxiety and depression tasks.
  • Mean absolute error drops 3.1 to 10.1 percent versus strong non-LLM baselines and up to 57.6 percent versus prior LLM baselines under rigorous LOSO evaluation.
  • Outcome-aligned abstractions learned via RLVR eliminate the need for gold-standard intermediate annotations during training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same semantic routing could apply to other longitudinal sensing tasks such as activity recognition or sleep staging where cohort shifts are common.
  • If language abstractions prove reusable, new deployments might require far less labeled target data than current numeric models.
  • The approach suggests a broader pattern: insert an explicit language layer between sensor streams and downstream models to improve robustness to distribution shift.

Load-bearing premise

High-level natural language abstractions of raw signals generalize better across datasets and sensing pipelines than models that operate directly on the numeric time series.

What would settle it

A new leave-one-dataset-out test where a direct numeric time-series model matches or beats TimeSRL on mean absolute error for anxiety or depression would show the semantic route does not deliver the claimed generalization gain.
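The shape of such a test can be sketched as below. The cohort names, the scores, and the train-mean predictor are placeholders invented for illustration; a real comparison would plug TimeSRL and a numeric model into the same loop.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_dataset_out(
    datasets: Dict[str, List[float]]
) -> Iterator[Tuple[str, List[float], List[float]]]:
    """Yield (held_out_name, train_scores, test_scores) for each dataset."""
    for held_out in datasets:
        train = [
            y for name, rows in datasets.items() if name != held_out for y in rows
        ]
        test = list(datasets[held_out])
        yield held_out, train, test


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical cohorts with made-up symptom scores.
datasets = {
    "cohort_A": [1.0, 2.0],
    "cohort_B": [3.0, 5.0],
    "cohort_C": [4.0],
}

results = {}
for held_out, train, test in leave_one_dataset_out(datasets):
    # Placeholder model: always predict the mean score of the training cohorts.
    train_mean = sum(train) / len(train)
    results[held_out] = mean_absolute_error(test, [train_mean] * len(test))

print(results)
```

Each fold's model never sees the held-out cohort, which is what makes the resulting MAE a measure of cross-dataset generalization rather than within-cohort fit.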

Figures

Figures reproduced from arXiv: 2605.21295 by Jingping Nie, Lilin Xu, Millie Wu, Qingyu Chen, Subigya Nepal, Xiaofan Jiang, Xin Liu, Xuhai "Orson" Xu, Yuang Fan, Yuzhe Yang, Zhuo Zhang.

Figure 1: Overview of TimeSRL, a two-stage LLM framework for robust longitudinal behavioral time-series modeling, instantiated on behavioral health prediction. While traditional ML models overfit numerical regularities and direct-prediction LLMs struggle with long numeric trajectories, TimeSRL addresses these distribution shift challenges by routing inference through an explicit semantic bottleneck. In Stage 1, it a…
Figure 2: The Two-Stage GRPO Tuning Pipeline for TimeSRL. The proposed architecture uses the same model for both stages. In Stage 1, the TimeSRL-LLM is given a prompt with behavioral data to examine the numerical data and summarize the findings. The model leverages an explicit reasoning process to generate #K semantic abstracted summaries. Next, only the generated summaries are extracted, passing through the semanti…
Figure 3: Example of the Two Stage Prompting used in the task of mental-health prediction. Starting from 14 days of tabular multi-variate time-series behavioral data, TimeSRL first constructs a semantic abstraction prompt in Stage 1 by organizing the data into a structured template, translating system feature names into descriptive labels, and converting raw sensor units into interpretable formats. This prompt guide…
Figure 4: LOSO MAE on GLOBEM and College Experience anxiety and depression prediction. Bars denote mean MAE and error bars denote 95% percentile bootstrap SE. Star annotations report paired bootstrap significance tests versus TimeSRL, indicating significantly higher MAE than TimeSRL, and significance levels are marked as * p < 0.05, ** p < 0.01, and *** p < 0.001. Across both datasets and tasks, TimeSRL maintains top-t…
Figure 5: MAE reduction from TimeSRL tuning across four LLM backbones on GLOBEM (LOSO). Bars compare direct prompting against the TimeSRL-tuned variant for anxiety and depression; error bars denote standard error. TimeSRL consistently improves every backbone, with relative MAE reductions of 38.4–61.6% across all backbone–task combinations. … tested models, the tuned version consistently outperforms direct prompting on…
Figure 6: Cross-benchmark (Cross-BM) transfer results on GLOBEM and College Experience. Each panel evaluates transfer in one target benchmark split after training on the other benchmark; the rightmost bar shows the within-benchmark (In-BM) TimeSRL reference for the same target study. Stars denote statistical significance vs. the Cross-BM reference (paired bootstrap; * p < 0.05, ** p < 0.01, *** p < 0.001; n.s. = not signific…
Figure 7: Qualitative comparison of intermediate summaries on a 14-day window. Model-generated summaries and predictions are presented alongside highlighted framing sentences (colored text) and annotated compression/preservation patterns (shaded boxes). Both untuned two-stage baselines (GPT-5.0, Qwen3-4B) compress the trajectory into a predominantly concern-heavy narrative, emphasizing salient irregularities while u…
Figure 8: Qualitative comparison on a 14-day CollegeExperience DS3 sample (gold anxiety score = 3). Model-generated summaries and predictions are presented alongside highlighted framing sentences (colored text) and annotated compression/preservation patterns (shaded boxes). Untuned two-stage baselines misconstrue localized irregularities (e.g., the D5/D6 sleep swing and D14 crash) as a pervasive anxiety pattern, re…
Figure 9: Qualitative comparison on a 14-day CollegeExperience DS2 depression sample (gold depression score = 2). Model-generated summaries and predictions are presented alongside highlighted framing sentences (colored text) and annotated compression/preservation patterns (shaded boxes). The untuned two-stage baselines compress the trajectory into a global decline-and-withdrawal narrative: GPT-5.0 hedges on confou…
Original abstract

Longitudinal passive sensing enables continuous health prediction, yet models often fail under cross-dataset distribution shifts. Traditional ML overfits cohort-specific artifacts, while Large Language Models (LLMs) struggle to reason reliably over long, heterogeneous time-series. We introduce TimeSRL, a two-stage LLM framework that routes predictions through an explicit semantic bottleneck. The model first abstracts raw signals into high-level natural language, then predicts behavioral outcomes from these abstractions alone. This forces the model to reason over semantic concepts that we argue generalize better than raw numbers. We optimize this process end-to-end using Group Relative Policy Optimization (GRPO) with Reinforcement Learning from Verifiable Rewards (RLVR), learning outcome-aligned abstractions without gold intermediate annotations. Instantiated on mental-health prediction, TimeSRL achieves state-of-the-art performance on a benchmark designed to stress-test cross-cohort generalization under a rigorous leave-one-dataset-out (LOSO) protocol, reducing mean absolute error (MAE) over strong non-LLM ML and LLM baselines by 3.1–10.1% and 9.5–44.1% for anxiety, and 3.2–9.6% and 27.4–57.6% for depression (all p < 0.05). TimeSRL significantly outperforms prior methods in cross-benchmark transfer across different sensing pipelines, rivaling its own within-domain performance without target-domain fine-tuning. These results demonstrate that semantic abstractions are reusable and point to a new direction for generalizable behavior modeling via RL-tuned LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces TimeSRL, a two-stage LLM framework for generalizable time-series behavioral modeling. Raw passive-sensing signals are first abstracted into high-level natural-language descriptions; predictions of behavioral outcomes (anxiety and depression scores) are then made exclusively from these abstractions. The abstraction and prediction stages are optimized end-to-end with Group Relative Policy Optimization (GRPO) under Reinforcement Learning from Verifiable Rewards (RLVR) that use only outcome-level supervision. The method is evaluated on a leave-one-dataset-out (LOSO) benchmark spanning multiple cohorts and sensing pipelines, claiming statistically significant MAE reductions of 3.1–10.1 % versus strong non-LLM ML baselines and 9.5–44.1 % versus prior LLM baselines for anxiety (analogous figures for depression), together with strong cross-benchmark transfer without target-domain fine-tuning.

Significance. If the central claim holds, the work demonstrates that explicit semantic natural-language bottlenecks can yield reusable abstractions that survive cross-cohort and cross-pipeline shifts better than direct numeric modeling, offering a concrete path for LLM-based longitudinal health prediction. The use of verifiable outcome rewards rather than fitted intermediate targets supplies external grounding, and the rigorous LOSO protocol is a methodological strength that directly addresses distribution-shift concerns common in passive-sensing studies.

major comments (2)
  1. [Experiments (LOSO results and ablations)] The central claim is that routing predictions through an explicit semantic natural-language abstraction produces reusable concepts that drive the reported LOSO gains. No ablation is presented that keeps the base LLM and the GRPO/RLVR procedure fixed while removing the semantic bottleneck (i.e., feeding raw numeric series directly to the prediction stage). Without this isolation, the observed 3.1–10.1 % and 9.5–44.1 % MAE reductions cannot be attributed specifically to the semantic abstraction rather than to LLM capacity or RL tuning effects alone.
  2. [Methods and Experimental Setup] Full details of baseline implementations, exact data-exclusion criteria, and the computation of error bars and p-values under the LOSO protocol are not provided. This prevents independent verification that the claimed improvements are free of post-hoc choices or implementation artifacts.
minor comments (2)
  1. [Abstract] The abstract reports improvement ranges (e.g., 3.1–10.1%) without mapping each endpoint to a specific baseline; a table or explicit listing would improve clarity.
  2. [Notation and Methods] Notation for the semantic abstraction function and the precise reward formulation in RLVR should be defined once and used consistently across sections.
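Major comment 2 asks how p-values were computed, and the Figure 4 caption mentions paired bootstrap tests. One common variant is sketched below for orientation only; the paper's exact procedure is not given on this page, and the function name, resample count, and error values are assumptions.

```python
import random
from typing import List, Tuple


def paired_bootstrap_mae(
    errors_a: List[float],
    errors_b: List[float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Compare two models by their absolute errors on the same test samples.

    Returns the observed MAE difference (a minus b) and a one-sided
    bootstrap p-value: the fraction of resamples in which model a's MAE
    is not higher than model b's.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("errors must be paired sample by sample")
    rng = random.Random(seed)
    n = len(errors_a)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = sum(diffs) / n
    not_higher = 0
    for _ in range(n_resamples):
        # Resample paired differences with replacement.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) / n <= 0:
            not_higher += 1
    return observed, not_higher / n_resamples


# Made-up absolute errors for a baseline and a candidate model.
baseline_errors = [2.0, 1.5, 3.0, 2.5, 1.0, 2.0]
model_errors = [1.0, 1.0, 2.0, 2.0, 0.5, 1.5]
observed_diff, p_value = paired_bootstrap_mae(baseline_errors, model_errors)
print(f"MAE difference = {observed_diff:.3f}, p = {p_value:.4f}")
```

Resampling the paired differences, rather than each model's errors separately, keeps per-sample difficulty matched between the two models. Reporting the resampling unit (window, participant, or dataset) alongside any multiple-comparison correction would address the referee's concern directly.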

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments and for recognizing the potential significance of semantic bottlenecks in generalizable time-series modeling. We address each major comment below and commit to revisions that strengthen the manuscript.

Point-by-point responses
  1. Referee: [Experiments (LOSO results and ablations)] The central claim is that routing predictions through an explicit semantic natural-language abstraction produces reusable concepts that drive the reported LOSO gains. No ablation is presented that keeps the base LLM and the GRPO/RLVR procedure fixed while removing the semantic bottleneck (i.e., feeding raw numeric series directly to the prediction stage). Without this isolation, the observed 3.1–10.1 % and 9.5–44.1 % MAE reductions cannot be attributed specifically to the semantic abstraction rather than to LLM capacity or RL tuning effects alone.

    Authors: We agree that this specific ablation is necessary to isolate the contribution of the semantic natural-language bottleneck from LLM capacity and RL tuning effects. While the manuscript includes comparisons to non-LLM ML baselines (which operate directly on raw numeric features) and prior LLM baselines, it does not hold the base LLM and GRPO/RLVR procedure fixed while bypassing the abstraction stage. We will add this control experiment in the revision: raw numeric time series will be provided directly to the prediction-stage LLM under identical GRPO/RLVR optimization, allowing direct attribution of gains to the semantic abstraction. revision: yes

  2. Referee: [Methods and Experimental Setup] Full details of baseline implementations, exact data-exclusion criteria, and the computation of error bars and p-values under the LOSO protocol are not provided. This prevents independent verification that the claimed improvements are free of post-hoc choices or implementation artifacts.

    Authors: We acknowledge that these implementation details are essential for reproducibility and independent verification. The revised manuscript will include an expanded Methods section and a dedicated appendix providing: (i) exact hyperparameter settings and code-level descriptions for all baselines, (ii) precise data-exclusion criteria applied per cohort and sensing pipeline, and (iii) full specification of how error bars and p-values were computed under the LOSO protocol, including the statistical tests and multiple-comparison corrections used. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on external RLVR rewards and LOSO evaluation

full rationale

The paper's central mechanism routes time-series through an explicit semantic abstraction step, then optimizes the full pipeline end-to-end via GRPO with RLVR. Rewards are defined from verifiable outcome labels (anxiety/depression scores) on the training datasets rather than from fitted intermediate targets. The LOSO protocol further separates training and test distributions across datasets and sensing pipelines, so the labels that supply the reward never come from the held-out evaluation data. No equation or step reduces the claimed generalization advantage to a fitted parameter or self-referential definition inside the paper; the performance deltas are presented as empirical outcomes of this externally grounded optimization. No load-bearing self-citations, uniqueness theorems, or ansatz smuggling appear in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the untested premise that semantic language abstractions are inherently more reusable across distributions than raw numeric features; this is stated as an argument rather than derived from prior evidence.

axioms (1)
  • domain assumption: Semantic concepts extracted from raw time-series signals generalize better than raw numeric features across cohorts and sensing pipelines.
    Explicitly argued in the abstract as the reason the two-stage design should outperform direct numeric models.

pith-pipeline@v0.9.0 · 5863 in / 1368 out tokens · 55332 ms · 2026-05-21T06:06:45.550200+00:00 · methodology


Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · 7 internal anchors

  1. [1]

    early to bed and early to rise

    Saeed Abdullah, Mark Matthews, Elizabeth L. Murnane, Geri Gay, and Tanzeem Choudhury. 2014. Towards circadian computing: "early to bed and early to rise" makes some of us unhealthy and sleep deprived. InProceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’14). Association for Computing Machinery, New Y...

  2. [2]

    Adler, Dror Ben-Zeev, Vincent W.-S

    Daniel A. Adler, Dror Ben-Zeev, Vincent W.-S. Tseng, John M. Kane, Rachel Brian, Andrew T. Campbell, Marta Hauser, Emily A. Scherer, and Tanzeem Choudhury. 2020. Predicting Early Warning Signs of Psychotic Relapse From Passive Sensing Data: An Approach Using Encoder-Decoder Neural Networks.JMIR mHealth and uHealth8, 8 (Aug. 2020), e19962. doi:10.2196/19962

  3. [3]

    Adler, Fei Wang, David C

    Daniel A. Adler, Fei Wang, David C. Mohr, and Tanzeem Choudhury. 2022. Machine learning for passive mental health symptom prediction: Generalization across different longitudinal mobile sensing studies.PLOS ONE17, 4 (April 2022), e0266516. doi:10.1371/journal.pone.0266516

  4. [4]

    Iftikhar Ahmed, Anushree Brahmacharimayum, Raja Hashim Ali, Talha Ali Khan, and Muhammad Ovais Ahmad. 2025. Explainable AI for Depression Detection and Severity Classification From Activity Data: Development and Evaluation Study of an Interpretable Framework.JMIR Mental Health 12, 1 (Sept. 2025), e72038. doi:10.2196/72038

  5. [5]

    Rebeka Amin, Simon Schreynemackers, Hannah Oppenheimer, Milica Petrovic, Ulrich Hegerl, and Hanna Reich. 2025. Use of Mobile Sensing Data for Longitudinal Monitoring and Prediction of Depression Severity: Systematic Review.Journal of Medical Internet Research27 (Aug. 2025), e57418. doi:10.2196/57418

  6. [6]

    Puyana, Ryan Kurtz, Tammy Chung, and Anind K

    Sangwon Bae, Denzil Ferreira, Brian Suffoletto, Juan C. Puyana, Ryan Kurtz, Tammy Chung, and Anind K. Dey. 2017. Detecting Drinking Episodes in Young Adults Using Smartphone-based Sensors.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.1, 2 (June 2017), 5:1–5:36. doi:10.1145/3090051 26 Fan et al

  7. [7]

    Andrey Bogomolov, Bruno Lepri, Michela Ferron, Fabio Pianesi, and Alex (Sandy) Pentland. 2014. Daily Stress Recognition from Mobile Phone Data, Weather Conditions and Individual Traits. InProceedings of the 22nd ACM international conference on Multimedia (MM ’14). Association for Computing Machinery, New York, NY, USA, 477–486. doi:10.1145/2647868.2654933

  8. [8]

    Borelli, Yuning Wang, Frances Haofei Li, Lyric N

    Jessica L. Borelli, Yuning Wang, Frances Haofei Li, Lyric N. Russo, Marta Tironi, Ken Yamashita, Elayne Zhou, Jocelyn Lai, Brenda Nguyen, Iman Azimi, Christopher Marcotullio, Sina Labbaf, Salar Jafarlou, Nikil Dutt, and Amir Rahmani. 2025. Detection of Depressive Symptoms in College Students Using Multimodal Passive Sensing Data and Light Gradient Boostin...

  9. [9]

    Mehdi Boukhechba, Philip Chow, Karl Fua, Bethany A Teachman, and Laura E Barnes. 2018. Predicting Social Anxiety From Global Positioning System Traces of College Students: Feasibility Study.JMIR Mental Health5, 3 (July 2018), e10101. doi:10.2196/10101

  10. [10]

    Hello AI

    Carrie J. Cai, Samantha Winter, David Steiner, Lauren Wilcox, and Michael Terry. 2019. "Hello AI": Uncovering the Onboarding Needs of Medical Practitioners for Human-AI Collaborative Decision-Making.Proc. ACM Hum.-Comput. Interact.3, CSCW (Nov. 2019), 104:1–104:24. doi:10.1145/3359206

  11. [11]

    Luca Canzian and Mirco Musolesi. 2015. Trajectories of depression: unobtrusive monitoring of depressive states by means of smartphone mobility traces analysis. InProceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’15). Association for Computing Machinery, New York, NY, USA, 1293–1304. doi:10.1145/2750...

  12. [12]

    Villalba, Janine M

    Prerna Chikersal, Afsaneh Doryab, Michael Tumminia, Daniella K. Villalba, Janine M. Dutcher, Xinwen Liu, Sheldon Cohen, Kasey G. Creswell, Jennifer Mankoff, J. David Creswell, Mayank Goel, and Anind K. Dey. 2021. Detecting Depression and Predicting its Onset Using Longitudinal Symptoms Captured by Passive Sensing: A Machine Learning Approach With Robust F...

  13. [13]

    DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai D...

  14. [14]

    Afsaneh Doryab, Daniella K Villalba, Prerna Chikersal, Janine M Dutcher, Michael Tumminia, Xinwen Liu, Sheldon Cohen, Kasey Creswell, Jennifer Mankoff, John D Creswell, and Anind K Dey. 2019. Identifying Behavioral Phenotypes of Loneliness and Social Isolation with Passive Sensing: Statistical Analysis, Data Mining and Machine Learning of Smartphone and F...

  15. [15]

    Morris, Xuhai "Orson" Xu, Chun-Cheng Chang, Lianhui Qin, Daniel McDuff, Xin Liu, Shwetak Patel, and Vikram Iyer

    Zachary Englhardt, Chengqian Ma, Margaret E. Morris, Xuhai "Orson" Xu, Chun-Cheng Chang, Lianhui Qin, Daniel McDuff, Xin Liu, Shwetak Patel, and Vikram Iyer. 2024. From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models.Proceedings of the ACM on Interactive, Mobile, Weara...

  16. [16]

    Yuang Fan, Jingping Nie, Xinghua Sun, and Xiaofan Jiang. 2024. Exploring foundation models in detecting concerning daily functioning in psychotherapeutic context based on images from smart home devices. In2024 IEEE International Workshop on Foundation Models for Cyber-Physical Systems & Internet of Things (FMSys). IEEE, 44–49

  17. [17]

    Ali Heydari, Maxwell A

    Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, and Xin Liu. 2025. RADAR: Benchmarking Language Models on Imperfect Tabular Data. doi:...

  18. [18]

    Harari, Nicholas D

    Gabriella M. Harari, Nicholas D. Lane, Rui Wang, Benjamin S. Crosier, Andrew T. Campbell, and Samuel D. Gosling. 2016. Using Smartphones to Collect Behavioral Data in Psychological Science: Opportunities, Practical Considerations, and Challenges.Perspectives on Psychological Science: A Journal of the Association for Psychological Science11, 6 (Nov. 2016),...

  19. [19]

    Ali Heydari, Ken Gu, Vidya Srinivas, Hong Yu, Zhihan Zhang, Yuwei Zhang, Akshay Paruchuri, Qian He, Hamid Palangi, Nova Hammerquist, Ahmed A

    A. Ali Heydari, Ken Gu, Vidya Srinivas, Hong Yu, Zhihan Zhang, Yuwei Zhang, Akshay Paruchuri, Qian He, Hamid Palangi, Nova Hammerquist, Ahmed A. Metwally, Brent Winslow, Yubin Kim, Kumar Ayush, Yuzhe Yang, Girish Narayanswamy, Maxwell A. Xu, Jake Garrison, Amy Armento Lee, Jenny Vafeiadou, Ben Graef, Isaac R. Galatzer-Levy, Erik Schenck, Andrew Barakat, J...

  20. [20]

    Karen Hovsepian, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, Motohiro Nakajima, and Santosh Kumar. 2015. cStress: towards a gold standard for continuous stress assessment in the mobile environment. InProceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’15). Association for Computing Machinery, New Yor...

  21. [21]

    Sheikh Asif Imran, Mohammad Nur Hossain Khan, Subrata Biswas, and Bashima Islam. 2025. LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors. doi:10.48550/arXiv.2406.14498 arXiv:2406.14498 [cs]

  22. [22]

    Natasha Jaques, Sara Taylor, Asaph Azaria, Asma Ghandeharioun, Akane Sano, and Rosalind Picard. 2015. Predicting students’ happiness from physiology, phone, mobility, and behavioral data.International Conference on Affective Computing and Intelligent Interaction and workshops : [proceedings]. ACII (Conference)2015 (Sept. 2015), 222–228. doi:10.1109/ACII.2...

  23. [23]

    Time-LLM: Time Series Forecasting by Reprogramming Large Language Models

    Ming Jin, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan, and Qingsong Wen. 2024. Time-LLM: Time Series Forecasting by Reprogramming Large Language Models. doi:10.48550/arXiv.2310.01728 arXiv:2310.01728 [cs]

  24. [24]

    James M. Joyce. 2011. Kullback-Leibler Divergence. InInternational Encyclopedia of Statistical Science, Miodrag Lovric (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 720–722. doi:10.1007/978-3-642-04898-2_327

  25. [25]

    Yubin Kim, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, and Hae Won Park. 2024. Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data. doi:10.48550/arXiv.2401.06866 arXiv:2401.06866 [cs]

  26. [26]

    Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, and Percy Liang. 2020. Concept Bottleneck Models. doi:10.48550/arXiv.2007.04612 arXiv:2007.04612 [cs]

  27. [27]

    Kroenke, R

    K. Kroenke, R. L. Spitzer, and J. B. Williams. 2001. The PHQ-9: validity of a brief depression severity measure.Journal of General Internal Medicine 16, 9 (Sept. 2001), 606–613. doi:10.1046/j.1525-1497.2001.016009606.x

  28. [28]

    Spitzer, Janet B

    Kurt Kroenke, Robert L. Spitzer, Janet B. W. Williams, and Bernd Löwe. 2009. An ultra-brief screening scale for anxiety and depression: the PHQ-4. Psychosomatics50, 6 (2009), 613–621. doi:10.1176/appi.psy.50.6.613

  29. [29]

    Tulu 3: Pushing Frontiers in Open Language Model Post-Training

    Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lester James V. Miranda, Alisa Liu, Nouha Dziri, Shane Lyu, Yuling Gu, Saumya Malik, Victoria Graf, Jena D. Hwang, Jiangjiang Yang, Ronan Le Bras, Oyvind Tafjord, Chris Wilhelm, Luca Soldaini, Noah A. Smith, Yizhong Wang, Pradeep Dasigi, and Hannaneh Hajishirzi...

  30. [30]

    Sirui Li, Shuhan Xiao, Mihir Joshi, Ahmed Metwally, Daniel McDuff, Wei Wang, and Yuzhe Yang. 2026. HEARTS: Benchmarking LLM Reasoning on Health Time Series.arXiv preprint arXiv:2603.06638(2026)

  31. [31]

    Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, and Flora D. Salim. 2025. SensorLLM: Human-Intuitive Alignment of Multivariate Sensor Data with LLMs for Activity Recognition. doi:10.48550/arXiv.2410.10624 arXiv:2410.10624 [cs]

  32. [32]

    Lane, and Lin Zhong

    Robert LiKamWa, Yunxin Liu, Nicholas D. Lane, and Lin Zhong. 2013. MoodScope: building a mood sensor from smartphone usage patterns. InProceeding of the 11th annual international conference on Mobile systems, applications, and services (MobiSys ’13). Association for Computing Machinery, New York, NY, USA, 389–402. doi:10.1145/2462456.2464449

  33. [33]

    Mack, Alex W

    Dante L. Mack, Alex W. DaSilva, Courtney Rogers, Elin Hedlund, Eilis I. Murphy, Vlado Vojdanovski, Jane Plomp, Weichen Wang, Subigya K. Nepal, Paul E. Holtzheimer, Dylan D. Wagner, Nicholas C. Jacobson, Meghan L. Meyer, Andrew T. Campbell, and Jeremy F. Huckins. 2021. Mental Health and Behavior of College Students During the COVID-19 Pandemic: Longitudina...

  34. [34]

    Lakmal Meegahapola, William Droz, Peter Kun, Amalia de Götzen, Chaitanya Nutakki, Shyam Diwakar, Salvador Ruiz Correa, Donglei Song, Hao Xu, Miriam Bidoglia, George Gaskell, Altangerel Chagnaa, Amarsanaa Ganbold, Tsolmon Zundui, Carlo Caprini, Daniele Miorandi, Alethia Hume, Jose Luis Zarza, Luca Cernuzzi, Ivano Bison, Marcelo Rodas Britez, Matteo Busso, ...

  35. [35]

    Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y

    Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, and Xin Liu

  36. [36]

    2026), 1143

    Transforming wearable data into personal health insights using large language model agents.Nature Communications17, 1 (Jan. 2026), 1143. doi:10.1038/s41467-025-67922-y

  37. [37]

    Jun-Ki Min, Afsaneh Doryab, Jason Wiese, Shahriyar Amini, John Zimmerman, and Jason I. Hong. 2014. Toss ’n’ turn: smartphone as sleep and sleep quality detector. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Association for Computing Machinery, New York, NY, USA, 477–486. doi:10.1145/2556288.2557220

  38. [38]

    Varun Mishra, Gunnar Pope, Sarah Lord, Stephanie Lewia, Byron Lowens, Kelly Caine, Sougata Sen, Ryan Halter, and David Kotz. 2020. Continuous Detection of Physiological Stress with Commodity Hardware.ACM Trans. Comput. Healthcare1, 2 (April 2020), 8:1–8:30. doi:10.1145/3361562

  39. [39]

    Mohr, Mi Zhang, and Stephen M

    David C. Mohr, Mi Zhang, and Stephen M. Schueller. 2017. Personal Sensing: Understanding Mental Health Using Ubiquitous Sensors and Machine Learning.Annual Review of Clinical Psychology13 (May 2017), 23–47. doi:10.1146/annurev-clinpsy-032816-044949 28 Fan et al

  40. [40]

    Mohr, and Laura Pulkki- Råback

    Isaac Moshe, Yannik Terhorst, Kennedy Opoku Asare, Lasse Bosse Sander, Denzil Ferreira, Harald Baumeister, David C. Mohr, and Laura Pulkki- Råback. 2021. Predicting Symptoms of Depression and Anxiety Using Smartphone and Wearable Data.Frontiers in Psychiatry12 (Jan. 2021). doi:10.3389/fpsyt.2021.625247 Publisher: Frontiers

  41. [41]

    Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jake Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, and Daniel McDuff. 2024. Scaling Wearable Foundation Models. doi:10.48550/arXiv.2410.13638 arXiv:2410.13638 [cs]

  42. [42]

    HUCKINS, COURTNEY ROGERS, MEGHAN L

    SUBIGYA NEPAL, WENJUN LIU, ARVIND PILLAI, WEICHEN WANG, VLADO VOJDANOVSKI, JEREMY F. HUCKINS, COURTNEY ROGERS, MEGHAN L. MEYER, and ANDREW T. CAMPBELL. 2024. Capturing the College Experience: A Four-Year Mobile Sensing Study of Mental Health, Resilience and Behavior of College Students during the Pandemic.Proceedings of the ACM on interactive, mobile, wea...

  43. [43]

    HEINZ, ASHMITA KUNWAR, EUNSOL SOUL CHOI, XUHAI XU, JOANNA KUC, JEREMY F

    SUBIGYA NEPAL, ARVIND PILLAI, WILLIAM CAMPBELL, TALIE MASSACHI, MICHAEL V. HEINZ, ASHMITA KUNWAR, EUNSOL SOUL CHOI, XUHAI XU, JOANNA KUC, JEREMY F. HUCKINS, JASON HOLDEN, SARAH M. PREUM, COLIN DEPP, NICHOLAS JACOBSON, MARY P. CZERWINSKI, ERIC GRANHOLM, and ANDREW T. CAMPBELL. 2024. MindScape Study: Integrating LLM and Behavioral Sensing for Personalized A...

  44. [44]

    Subigya Nepal, Arvind Pillai, Weichen Wang, Tess Griffin, Amanda C Collins, Michael Heinz, Damien Lekkas, Shayan Mirjafari, Matthew Nemesure, George Price, Nicholas Jacobson, and Andrew Campbell. 2024. MoodCapture: Depression Detection using In-the-Wild Smartphone Images. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI...

  45. [45]

    Jingping Nie, Yanchen Liu, Yigong Hu, Yuanyuting Wang, Stephen Xia, Matthias Preindl, and Xiaofan Jiang. 2021. SPIDERS+: A light-weight, wireless, and low-cost glasses-based wearable platform for emotion sensing and bio-signal acquisition. Pervasive and Mobile Computing 75 (2021), 101424

  46. [46]

    Jingping Nie, Hanya (Vera) Shao, Yuang Fan, Qijia Shao, Haoxuan You, Matthias Preindl, and Xiaofan Jiang. 2025. LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices. ACM Trans. Comput. Healthcare (Jan. 2025). doi:10.1145/3712299 Just Accepted

  47. [47]

    Jingping Nie, Minghui Zhao, Stephen Xia, Xinghua Sun, Hanya Shao, Yuang Fan, Matthias Preindl, and Xiaofan Jiang. 2022. AI therapist for daily functioning assessment and intervention using smart home devices. In Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems. 764–765

  48. [48]

    OpenAI. 2026. GPT-5. https://openai.com Accessed via ChatGPT interface

  49. [49]

    Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human fee...

  50. [50]

    Arvind Pillai, Subigya Kumar Nepal, Weichen Wang, Matthew Nemesure, Michael Heinz, George Price, Damien Lekkas, Amanda C. Collins, Tess Griffin, Benjamin Buck, Sarah Masud Preum, Trevor Cohen, Nicholas C. Jacobson, Dror Ben-Zeev, and Andrew Campbell. 2024. Investigating Generalizability of Speech-based Suicidal Ideation Detection Using Mobile Phones. Proc....

  51. [51]

    Mashfiqui Rabbi, Min Hane Aung, Mi Zhang, and Tanzeem Choudhury. 2015. MyBehavior: automatic personalized health feedback from user behaviors and preferences using smartphones. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’15). Association for Computing Machinery, New York, NY, USA, 707–718. d...

  52. [52]

    Yuri Rykov, Thuan-Quoc Thach, Iva Bojic, George Christopoulos, and Josip Car. 2021. Digital Biomarkers for Depression Screening With Wearable Devices: Cross-sectional Study With Machine Learning Modeling. JMIR mHealth and uHealth 9, 10 (Oct. 2021), e24872. doi:10.2196/24872

  53. [53]

    Sohrab Saeb, Mi Zhang, Christopher J. Karr, Stephen M. Schueller, Marya E. Corden, Konrad P. Kording, and David C. Mohr. 2015. Mobile Phone Sensor Correlates of Depressive Symptom Severity in Daily-Life Behavior: An Exploratory Study. Journal of Medical Internet Research 17, 7 (July 2015), e4273. doi:10.2196/jmir.4273

  54. [54]

    Akane Sano and Rosalind W. Picard. 2013. Stress Recognition Using Wearable Sensors and Mobile Phones. In Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII ’13). IEEE Computer Society, USA, 671–676. doi:10.1109/ACII.2013.117

  55. [55]

    Akane Sano, Sara Taylor, Andrew W. McHill, Andrew Jk Phillips, Laura K. Barger, Elizabeth Klerman, and Rosalind Picard. 2018. Identifying Objective Physiological Markers and Modifiable Behaviors for Self-Reported Stress and Mental Health Status Using Wearable Sensors and Mobile Phones: Observational Study. Journal of Medical Internet Research 20, 6 (June 20...

  56. [56]

    Sandra Servia-Rodríguez, Kiran K. Rachuri, Cecilia Mascolo, Peter J. Rentfrow, Neal Lathia, and Gillian M. Sandstrom. 2017. Mobile Sensing at the Service of Mental Well-being: a Large-scale Longitudinal Study. In Proceedings of the 26th International Conference on World Wide Web (WWW ’17). International World Wide Web Conferences Steering Committee, Republ...

  57. [57]

    Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. 2024. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. doi:10.48550/arXiv.2402.03300 arXiv:2402.03300 [cs]

  58. [58]

    Zitao Shuai, Zongzhe Xu, David Yang, Wei Wang, and Yuzhe Yang. 2026. OSF: On Pre-training and Scaling of Sleep Foundation Models. arXiv preprint arXiv:2603.00190 (2026).

  59. [59]

    Robert L. Spitzer, Kurt Kroenke, Janet B. W. Williams, and Bernd Löwe. 2006. A brief measure for assessing generalized anxiety disorder: the GAD-7. Archives of Internal Medicine 166, 10 (May 2006), 1092–1097. doi:10.1001/archinte.166.10.1092

  60. [60]

    Shaoxiong Sun, Amos A. Folarin, Yuezhou Zhang, Nicholas Cummins, Rafael Garcia-Dias, Callum Stewart, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Petroula Laiou, Heet Sankesara, Faith Matcham, Daniel Leightley, Katie M. White, Carolin Oetzmann, Alina Ivan, Femke Lamers, Sara Siddi, Sara Simblett, Raluca Nica, Aki Rintala, David C. Mohr, Inez Myin-Ge...

  61. [61]

    Qwen Team. 2025. Qwen3 Technical Report. arXiv:2505.09388 [cs.CL] https://arxiv.org/abs/2505.09388

  62. [62]

    Ye Tian, Xiaoyuan Ren, Zihao Wang, Onat Gungor, Xiaofan Yu, and Tajana Rosing. 2025. DailyLLM: Context-Aware Activity Log Generation Using Multi-Modal Sensors and LLMs. doi:10.48550/arXiv.2507.13737 arXiv:2507.13737 [cs] version: 1

  63. [63]

    Vincent W.-S. Tseng, Akane Sano, Dror Ben-Zeev, Rachel Brian, Andrew T. Campbell, Marta Hauser, John M. Kane, Emily A. Scherer, Rui Wang, Weichen Wang, Hongyi Wen, and Tanzeem Choudhury. 2020. Using behavioral rhythms and multi-task learning to predict fine-grained symptoms of schizophrenia. Scientific Reports 10, 1 (Sept. 2020), 15100. doi:10.1038/s41598-0...

  64. [64]

    Rui Wang, Min S. H. Aung, Saeed Abdullah, Rachel Brian, Andrew T. Campbell, Tanzeem Choudhury, Marta Hauser, John Kane, Michael Merrill, Emily A. Scherer, Vincent W. S. Tseng, and Dror Ben-Zeev. 2016. CrossCheck: toward passive sensing and detection of mental health changes in people with schizophrenia. In Proceedings of the 2016 ACM International Joint Co...

  65. [65]

    Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. 2014. StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computin...

  66. [66]

    Rui Wang, Gabriella Harari, Peilin Hao, Xia Zhou, and Andrew T. Campbell. 2015. SmartGPA: how smartphones can assess and predict academic performance of college students. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’15). Association for Computing Machinery, New York, NY, USA, 295–306. doi:10....

  67. [67]

    Rui Wang, Weichen Wang, Min S. H. Aung, Dror Ben-Zeev, Rachel Brian, Andrew T. Campbell, Tanzeem Choudhury, Marta Hauser, John Kane, Emily A. Scherer, and Megan Walsh. 2017. Predicting Symptom Trajectories of Schizophrenia using Mobile Sensing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3 (Sept. 2017), 110:1–110:24. doi:10.1145/3130976

  68. [68]

    Rui Wang, Weichen Wang, Alex daSilva, Jeremy F. Huckins, William M. Kelley, Todd F. Heatherton, and Andrew T. Campbell. 2018. Tracking Depression Dynamics in College Students Using Mobile Phone and Wearable Sensing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 1 (March 2018), 43:1–43:26. doi:10.1145/3191775

  69. [69]

    Xumeng Wen, Zihan Liu, Shun Zheng, Shengyu Ye, Zhirong Wu, Yang Wang, Zhijian Xu, Xiao Liang, Junjie Li, Ziming Miao, Jiang Bian, and Mao Yang. 2025. Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs. doi:10.48550/arXiv.2506.14245 arXiv:2506.14245 [cs]

  70. [70]

    Wuyue Xia, Hanya Shao, Ningxin Kong, Yuang Fan, and Jingping Nie. 2025. The Convergence of Mental Health and AI: A Cross-Disciplinary Survey of Ubiquitous Sensing, LLMs, and Clinical Alignment. doi:10.36227/techrxiv.176521329.92810310/v1

  71. [71]

    Xuhai Xu, Prerna Chikersal, Afsaneh Doryab, Daniella K. Villalba, Janine M. Dutcher, Michael J. Tumminia, Tim Althoff, Sheldon Cohen, Kasey G. Creswell, J. David Creswell, Jennifer Mankoff, and Anind K. Dey. 2019. Leveraging Routine Behavior and Contextually-Filtered Features for Depression Detection among College Students. Proc. ACM Interact. Mob. Wearabl...

  72. [72]

    Xuhai Xu, Prerna Chikersal, Janine M. Dutcher, Yasaman S. Sefidgar, Woosuk Seo, Michael J. Tumminia, Daniella K. Villalba, Sheldon Cohen, Kasey G. Creswell, J. David Creswell, Afsaneh Doryab, Paula S. Nurius, Eve Riskin, Anind K. Dey, and Jennifer Mankoff. 2021. Leveraging Collaborative-Filtering for Personalized Behavior Modeling: A Case Study of Depress...

  73. [73]

    Xuhai Xu, Xin Liu, Han Zhang, Weichen Wang, Subigya Nepal, Yasaman Sefidgar, Woosuk Seo, Kevin S. Kuehn, Jeremy F. Huckins, Margaret E. Morris, Paula S. Nurius, Eve A. Riskin, Shwetak Patel, Tim Althoff, Andrew Campbell, Anind K. Dey, and Jennifer Mankoff. 2023. GLOBEM: Cross-Dataset Generalization of Longitudinal Human Behavior Modeling. Proc. ACM Interac...

  74. [74]

    Xuhai Xu, Bingsheng Yao, Yuanzhe Dong, Saadia Gabriel, Hong Yu, James Hendler, Marzyeh Ghassemi, Anind K. Dey, and Dakuo Wang. 2024. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 8, 1 (March 2024), 31:1–31:32. doi:10.1145/3643540

  75. [75]

    Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret E. Morris, Eve Riskin, Jennifer Mankoff, and Anind K. Dey. 2023. GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization. arXiv:2211.02733 [cs.LG] https://ar...

  76. [76]

    Zongzhe Xu, Zitao Shuai, Eideen Mozaffari, Ravi S Aysola, Rajesh Kumar, and Yuzhe Yang. 2026. SleepLM: Natural-Language Intelligence for Human Sleep. arXiv preprint arXiv:2602.23605 (2026).

  77. [77]

    Yuzhe Yang, Yuan Yuan, Guo Zhang, Hao Wang, Ying-Cong Chen, Yingcheng Liu, Christopher G Tarolli, Daniel Crepeau, Jan Bukartyk, Mithri R Junna, et al. 2022. Artificial intelligence-enabled detection and assessment of Parkinson’s disease using nocturnal breathing signals. Nature Medicine 28, 10 (2022), 2207–2215

  78. [78]

    Tianyi Zhang, Miu Kojima, and Simon D’Alfonso. 2024. AWARE Narrator and the Utilization of Large Language Models to Extract Behavioral Insights from Smartphone Sensing Data. doi:10.48550/arXiv.2411.04691 arXiv:2411.04691 [cs]

  79. [79]

    Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, Tim Althoff, Yun Liu, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Cecilia Mascolo, Xin Liu, Daniel McDuff, and Yuzhe Yang. 2025. SensorLM: Learning the Language of Wearable Sensors. doi:10.48550/a...

  80. [80]

    Yuze Zhao, Jintao Huang, Jinghan Hu, Xingjun Wang, Yunlin Mao, Daoze Zhang, Hong Zhang, Zeyinzi Jiang, Zhikai Wu, Baole Ai, Ang Wang, Wenmeng Zhou, and Yingda Chen. 2025. SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning. doi:10.48550/arXiv.2408.05517 arXiv:2408.05517 [cs] version: 4
