Gravity-Aware Hierarchical Routing for Lightweight SensorLLM on Human Activity Recognition

Hao Li; Mingrui Zheng; Yasuyuki Tahara; Yuichi Sei

arxiv: 2606.04019 · v1 · pith:SD66DN4Vnew · submitted 2026-06-01 · 📡 eess.SP · cs.AI

Gravity-Aware Hierarchical Routing for Lightweight SensorLLM on Human Activity Recognition

Hao Li , Mingrui Zheng , Yasuyuki Tahara , Yuichi Sei This is my paper

Pith reviewed 2026-06-28 13:25 UTC · model grok-4.3

classification 📡 eess.SP cs.AI

keywords human activity recognitionSensorLLMhierarchical routinggravity-awarewearable sensorsmodel compressionmacro-F1static activities

0 comments

The pith

A gravity-aware hierarchical routing head improves static class recognition in compact SensorLLM models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that two-stage sensor-language alignment works for large models but causes clear drops in static activity classes when the backbone is compressed to something like TinyLlama. It introduces a lightweight post-alignment routing head that reads per-channel mean and standard deviation from the Chronos tokenizer to extract posture and gravity signals, then softly routes between a static expert and a full expert. A load-balancing loss keeps the routing stable during training. On the MHealth dataset this raises macro-F1 mainly for low-motion classes while dynamic activity scores stay strong and the added cost stays small. Readers would care because it offers a way to keep accurate wearable recognition feasible on resource-limited hardware without full retraining.

Core claim

The gravity-aware hierarchical routing head uses per-channel mean and std from the Chronos tokenizer state to extract statistical cues related to posture and gravity direction, then adaptively combines a static expert and a full expert through soft routing together with a load-balancing loss, producing significant macro-F1 improvement on the MHealth dataset with minimal parameter overhead concentrated on static classes.

What carries the argument

Gravity-aware hierarchical routing head that performs soft routing between a static expert and a full expert using per-channel mean and std from the Chronos tokenizer state.

If this is right

Static activity classes receive accuracy gains without harming dynamic activity performance.
The added parameter count stays minimal enough for compact device deployment.
The head functions as a post-alignment adaptation on already-aligned models.
The load-balancing loss supports stable training of the routing decisions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same statistical cue extraction could be tested on other sensor datasets to check whether gravity-related routing generalizes.
The approach might reduce the need for larger backbones in other compressed multimodal sensor tasks that show static-dynamic imbalances.
Device-level power savings could result if the small overhead permits fully on-edge inference without cloud offload.
Extending the routing to additional expert types might address further failure modes observed in compact sensor models.

Load-bearing premise

The per-channel mean and standard deviation from the Chronos tokenizer state reliably extract statistical cues about posture and gravity direction that support effective soft routing.

What would settle it

Ablating the routing head or replacing its mean/std inputs with random values on the MHealth dataset produces no macro-F1 gain on the static classes.

Figures

Figures reproduced from arXiv: 2606.04019 by Hao Li, Mingrui Zheng, Yasuyuki Tahara, Yuichi Sei.

**Figure 1.** Figure 1: Overview of the proposed gravity-aware hierarchical routing head. The input multichannel sensor window is [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗

**Figure 2.** Figure 2: Illustration of the unified evaluation protocol used for all methods in this paper. Each method is trained with [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Per-class F1 comparison on MHealth. Bar heights show the mean F1 over five random seeds for the compact [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

**Figure 4.** Figure 4: Multi-seed stability comparison between the compact TinyLlama baseline and the proposed gravity-aware [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗

**Figure 5.** Figure 5: Ablation summary on MHealth. Panel A compares the compact baseline, Ablation A, and Ablation B, [PITH_FULL_IMAGE:figures/full_fig_p013_5.png] view at source ↗

**Figure 6.** Figure 6: Comprehensive overview of the full experimental results on MHealth, summarizing overall metrics, per-class [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗

read the original abstract

Recent studies on sensor-language alignment have shown that two-stage frameworks can improve the semantic modeling ability of wearable-sensor human activity recognition (HAR), where SensorLLM-style methods first perform motion-to-language alignment and then fine-tune the model for downstream tasks. However, our experiments reveal a consistent failure mode when the Stage 2 backbone is compressed to a compact model such as TinyLlama: recognition of dynamic activities remains relatively strong, while the discrimination of low-motion static classes such as standing, sitting, and lying degrades substantially. To address this issue, we propose a gravity-aware hierarchical routing head as a lightweight post-alignment adaptation built on top of an already aligned model, rather than a new large-scale pretraining framework. The method uses the per-channel mean and std from the Chronos tokenizer state to extract statistical cues related to posture and gravity direction, and adaptively combines a static expert and a full expert through soft routing, together with a load-balancing loss for stable training. On the MHealth dataset, this design significantly improves macro-F1 with minimal parameter overhead, and the gains are concentrated mainly on static classes while preserving strong performance on dynamic activities. As a first arXiv disclosure, the current paper reports results on a single dataset only, with the goal of highlighting the core method and laying the groundwork for broader evaluation in future work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adds a lightweight post-alignment routing head that uses tokenizer statistics to fix static-class drops in compressed SensorLLM on one HAR dataset.

read the letter

The main takeaway is a targeted engineering fix for a documented weakness in two-stage sensor-language models. When the Stage 2 backbone is shrunk to TinyLlama size, static activities lose accuracy while dynamic ones stay usable. The authors respond with a gravity-aware hierarchical routing head that sits on top of an already-aligned model, pulls per-channel mean and std from the Chronos tokenizer, and softly routes between a static expert and the full expert plus a load-balancing term.

This is genuinely new in its narrow scope: prior two-stage frameworks are extended by this cheap adaptation rather than another pretraining run. The method description is consistent, the parameter overhead is minimal by design, and the claim that gains concentrate on static classes while preserving dynamic performance follows directly from the routing logic.

The soft spot is the evaluation. Everything rests on MHealth alone, and the abstract supplies no numbers, baselines, error bars, or ablations. The assumption that tokenizer mean/std reliably encode gravity direction is plausible but untested beyond this dataset. The authors themselves flag the single-dataset limit.

The work is for groups already running SensorLLM-style pipelines who need a drop-in way to recover static-class performance without retraining the whole stack. It is coherent on its own terms and the central empirical claim is falsifiable, so it deserves a serious referee even though broader datasets would strengthen it.

Referee Report

3 major / 0 minor

Summary. The paper proposes a gravity-aware hierarchical routing head as a lightweight post-alignment adaptation for compressed SensorLLM models (e.g., TinyLlama) in wearable-sensor human activity recognition. It extracts per-channel mean and standard deviation from the Chronos tokenizer state to drive soft routing between a static expert and a full expert, augmented by a load-balancing loss. The central empirical claim is that this yields significant macro-F1 gains on the MHealth dataset, concentrated on static classes (standing, sitting, lying) while preserving dynamic activity performance, at minimal parameter overhead. The work is presented as a first arXiv disclosure focused on a single dataset.

Significance. If the reported gains hold under scrutiny, the method supplies a practical, low-overhead fix for a documented failure mode in two-stage sensor-language alignment when backbones are compressed. By reusing existing tokenizer statistics rather than introducing new pretraining or heavy modules, it could support more deployable SensorLLM variants on resource-limited hardware. The targeted focus on static vs. dynamic discrimination via gravity cues is a focused contribution within the HAR literature.

major comments (3)

[Abstract] Abstract: the central claim states that the design 'significantly improves macro-F1' on MHealth with gains 'concentrated mainly on static classes,' yet supplies no numerical deltas, baseline models, ablation results, error bars, or statistical tests. This absence makes it impossible to evaluate the magnitude or reliability of the improvement that underpins the paper's contribution.
[Method] Method description: the assumption that per-channel mean and std from the Chronos tokenizer state reliably encode posture/gravity cues sufficient for effective soft routing is load-bearing for the claimed gains, but the manuscript provides no supporting analysis, visualization, or sensitivity test of these statistics. Without this, it remains unclear whether the routing mechanism is doing the intended work or whether gains could arise from other factors.
[Experiments] Experiments: results are confined to a single dataset (MHealth) despite the acknowledged limitation. Given that the failure mode is presented as consistent across compressed backbones, the absence of at least one additional dataset or cross-sensor validation weakens the generality of the conclusion that the routing head addresses the static-class degradation broadly.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. Below we respond point-by-point to the three major comments, indicating revisions where appropriate while defending the manuscript's scope as an initial disclosure.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim states that the design 'significantly improves macro-F1' on MHealth with gains 'concentrated mainly on static classes,' yet supplies no numerical deltas, baseline models, ablation results, error bars, or statistical tests. This absence makes it impossible to evaluate the magnitude or reliability of the improvement that underpins the paper's contribution.

Authors: We agree that the abstract should be more self-contained. In the revised version we will insert the key quantitative results (macro-F1 deltas on static classes versus the compressed baseline, with reference to the models and ablations already reported in Section 4) while keeping the abstract concise. The full experimental tables, including error bars from multiple runs, remain in the body of the paper. revision: yes
Referee: [Method] Method description: the assumption that per-channel mean and std from the Chronos tokenizer state reliably encode posture/gravity cues sufficient for effective soft routing is load-bearing for the claimed gains, but the manuscript provides no supporting analysis, visualization, or sensitivity test of these statistics. Without this, it remains unclear whether the routing mechanism is doing the intended work or whether gains could arise from other factors.

Authors: We accept this observation. The revised manuscript will include a new subsection (or appendix) with (i) histograms and t-SNE visualizations of the per-channel mean/std statistics stratified by static versus dynamic activities, (ii) a sensitivity analysis showing how perturbations to these statistics affect routing weights, and (iii) an ablation that replaces the tokenizer statistics with random features to confirm that the gravity-related signal is necessary for the observed gains. revision: yes
Referee: [Experiments] Experiments: results are confined to a single dataset (MHealth) despite the acknowledged limitation. Given that the failure mode is presented as consistent across compressed backbones, the absence of at least one additional dataset or cross-sensor validation weakens the generality of the conclusion that the routing head addresses the static-class degradation broadly.

Authors: The manuscript already states explicitly that it is a first arXiv disclosure focused on a single dataset to introduce the core method. Because the failure mode is documented across multiple compressed backbones on MHealth, the immediate contribution is the lightweight fix rather than a broad benchmark. We will strengthen the discussion section to emphasize this scope and to outline planned multi-dataset validation in follow-up work; adding new datasets at this stage would change the paper's stated purpose. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes an empirical lightweight adaptation method using per-channel mean/std statistics from an existing Chronos tokenizer to drive soft routing between static and full experts, plus a load-balancing loss. No derivation chain, equations, predictions, or self-citations are presented that reduce any claim to its inputs by construction. Performance gains on MHealth are reported as experimental outcomes with explicit acknowledgment of single-dataset scope. The method is internally consistent and does not rely on fitted parameters renamed as predictions or uniqueness theorems from prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The method rests on the assumption that tokenizer statistics capture posture cues and that soft routing with load-balancing loss stabilizes training; no free parameters or invented entities are quantified in the abstract.

axioms (1)

domain assumption Per-channel mean and std from Chronos tokenizer state provide usable cues for posture and gravity direction
Directly invoked to drive the routing decision in the proposed head.

invented entities (1)

gravity-aware hierarchical routing head no independent evidence
purpose: Adaptively combine static expert and full expert via soft routing
New component introduced as post-alignment adaptation

pith-pipeline@v0.9.1-grok · 5776 in / 1175 out tokens · 25726 ms · 2026-06-28T13:25:25.921259+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

17 extracted references · 9 canonical work pages · 3 internal anchors

[1]

Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, and Flora D. Salim. Sensorllm: Aligning large language models with motion sensors for human activity recognition. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 354–379. Association for Computational Linguistics, 2025

2025
[2]

Mhealth dataset

Oresti Banos, Rafael García, and Alejandro Saez. Mhealth dataset. UCI Machine Learning Repository, 2014

2014
[3]

Human activity recognition using inertial sensors in a smartphone: An overview.Sensors, 19(14):3213, 2019

Wesllen Sousa Lima, Eduardo Souto, Khalil El-Khatib, Roozbeh Jalali, and Joao Gama. Human activity recognition using inertial sensors in a smartphone: An overview.Sensors, 19(14):3213, 2019. 15 Gravity-Aware Routing for Lightweight SensorLLM Figure 6: Comprehensive overview of the full experimental results on MHealth, summarizing overall metrics, per-clas...

2019
[4]

Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition.Sensors, 16(1):115, 2016

Francisco Javier Ordóñez and Daniel Roggen. Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition.Sensors, 16(1):115, 2016

2016
[5]

TinyLlama: An Open-Source Small Language Model

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, and Wei Lu. Tinyllama: An open-source small language model. arXiv preprint arXiv:2401.02385, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[6]

Reliable recognition of lying, sitting, and standing with a hip-worn accelerometer.Scandinavian Journal of Medicine & Science in Sports, 28(3):1092–1102, 2018

Henri Vähä-Ypyä, Pauliina Husu, Jaana Suni, Tommi Vasankari, and Harri Sievänen. Reliable recognition of lying, sitting, and standing with a hip-worn accelerometer.Scandinavian Journal of Medicine & Science in Sports, 28(3):1092–1102, 2018

2018
[7]

Ali Heydari, Girish Narayanswamy, Maxwell A

Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, Tim Althoff, Yun Liu, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Cecilia Mascolo, Xin Liu, Daniel McDuff, and Yuzhe Yang. Sensorlm: Learning the language of wearable sensors.arXiv preprint arXiv:...

work page arXiv 2025
[8]

Griffin, Lisa Marsch, Michael V

Yuliang Chen, Arvind Pillai, Yu Yvonne Wu, Tess Z. Griffin, Lisa Marsch, Michael V . Heinz, Nicholas C. Jacobson, and Andrew Campbell. Learning transferable sensor models via language-informed pretraining.arXiv preprint arXiv:2603.11950, 2026. 16 Gravity-Aware Routing for Lightweight SensorLLM

work page arXiv 2026
[9]

Collins, Daniel M

Arvind Pillai, Dimitris Spathis, Subigya Nepal, Amanda C. Collins, Daniel M. Mackin, Michael V . Heinz, Tess Z. Griffin, Nicholas C. Jacobson, and Andrew Campbell. Time2lang: Bridging time-series foundation models and large language models for health sensing beyond prompting.arXiv preprint arXiv:2502.07608, 2025

work page arXiv 2025
[10]

Chatts: Aligning time series with llms via synthetic data for enhanced understanding and reasoning.arXiv preprint arXiv:2412.03104, 2024

Zhe Xie, Zeyan Li, Xiao He, Longlong Xu, Xidao Wen, Tieying Zhang, Jianjun Chen, Rui Shi, and Dan Pei. Chatts: Aligning time series with llms via synthetic data for enhanced understanding and reasoning.arXiv preprint arXiv:2412.03104, 2024

work page arXiv 2024
[11]

Scaling wearable foundation models.arXiv preprint arXiv:2410.13638, 2024

Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jake Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, and Daniel McDuff. Scaling wearable foundation models.arXiv preprint arXiv:2410.13638, 2024

work page arXiv 2024
[12]

Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G

Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotr...

work page arXiv 2024
[13]

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. Out- rageously large neural networks: The sparsely-gated mixture-of-experts layer.arXiv preprint arXiv:1701.06538, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[14]

Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity.Journal of Machine Learning Research, 23(120):1–39, 2022

William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity.Journal of Machine Learning Research, 23(120):1–39, 2022

2022
[15]

Holgado-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga

Oresti Banos, Rafael García, Juan A. Holgado-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga. mhealthdroid: A novel framework for agile development of mobile health applications. InInternational Work-Conference on Ambient Assisted Living, pages 91–98. Springer, 2014

2014
[16]

Chronos: Learning the Language of Time Series

Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, and Yuyang Wang. Chronos: Learning the language ...

work page internal anchor Pith review Pith/arXiv arXiv 2024
[17]

Wearable sensor-based human activity recognition with transformer model.Sensors, 22(5):1911, 2022

Iveta Dirgová Luptáková, Martin Kubovˇcík, and Jiˇrí Pospíchal. Wearable sensor-based human activity recognition with transformer model.Sensors, 22(5):1911, 2022. 17

1911

[1] [1]

Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, and Flora D. Salim. Sensorllm: Aligning large language models with motion sensors for human activity recognition. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 354–379. Association for Computational Linguistics, 2025

2025

[2] [2]

Mhealth dataset

Oresti Banos, Rafael García, and Alejandro Saez. Mhealth dataset. UCI Machine Learning Repository, 2014

2014

[3] [3]

Human activity recognition using inertial sensors in a smartphone: An overview.Sensors, 19(14):3213, 2019

Wesllen Sousa Lima, Eduardo Souto, Khalil El-Khatib, Roozbeh Jalali, and Joao Gama. Human activity recognition using inertial sensors in a smartphone: An overview.Sensors, 19(14):3213, 2019. 15 Gravity-Aware Routing for Lightweight SensorLLM Figure 6: Comprehensive overview of the full experimental results on MHealth, summarizing overall metrics, per-clas...

2019

[4] [4]

Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition.Sensors, 16(1):115, 2016

Francisco Javier Ordóñez and Daniel Roggen. Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition.Sensors, 16(1):115, 2016

2016

[5] [5]

TinyLlama: An Open-Source Small Language Model

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, and Wei Lu. Tinyllama: An open-source small language model. arXiv preprint arXiv:2401.02385, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[6] [6]

Reliable recognition of lying, sitting, and standing with a hip-worn accelerometer.Scandinavian Journal of Medicine & Science in Sports, 28(3):1092–1102, 2018

Henri Vähä-Ypyä, Pauliina Husu, Jaana Suni, Tommi Vasankari, and Harri Sievänen. Reliable recognition of lying, sitting, and standing with a hip-worn accelerometer.Scandinavian Journal of Medicine & Science in Sports, 28(3):1092–1102, 2018

2018

[7] [7]

Ali Heydari, Girish Narayanswamy, Maxwell A

Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, Tim Althoff, Yun Liu, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Cecilia Mascolo, Xin Liu, Daniel McDuff, and Yuzhe Yang. Sensorlm: Learning the language of wearable sensors.arXiv preprint arXiv:...

work page arXiv 2025

[8] [8]

Griffin, Lisa Marsch, Michael V

Yuliang Chen, Arvind Pillai, Yu Yvonne Wu, Tess Z. Griffin, Lisa Marsch, Michael V . Heinz, Nicholas C. Jacobson, and Andrew Campbell. Learning transferable sensor models via language-informed pretraining.arXiv preprint arXiv:2603.11950, 2026. 16 Gravity-Aware Routing for Lightweight SensorLLM

work page arXiv 2026

[9] [9]

Collins, Daniel M

Arvind Pillai, Dimitris Spathis, Subigya Nepal, Amanda C. Collins, Daniel M. Mackin, Michael V . Heinz, Tess Z. Griffin, Nicholas C. Jacobson, and Andrew Campbell. Time2lang: Bridging time-series foundation models and large language models for health sensing beyond prompting.arXiv preprint arXiv:2502.07608, 2025

work page arXiv 2025

[10] [10]

Chatts: Aligning time series with llms via synthetic data for enhanced understanding and reasoning.arXiv preprint arXiv:2412.03104, 2024

Zhe Xie, Zeyan Li, Xiao He, Longlong Xu, Xidao Wen, Tieying Zhang, Jianjun Chen, Rui Shi, and Dan Pei. Chatts: Aligning time series with llms via synthetic data for enhanced understanding and reasoning.arXiv preprint arXiv:2412.03104, 2024

work page arXiv 2024

[11] [11]

Scaling wearable foundation models.arXiv preprint arXiv:2410.13638, 2024

Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, Shun Liao, Jake Garrison, Shyam Tailor, Jake Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, and Daniel McDuff. Scaling wearable foundation models.arXiv preprint arXiv:2410.13638, 2024

work page arXiv 2024

[12] [12]

Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G

Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotr...

work page arXiv 2024

[13] [13]

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. Out- rageously large neural networks: The sparsely-gated mixture-of-experts layer.arXiv preprint arXiv:1701.06538, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[14] [14]

Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity.Journal of Machine Learning Research, 23(120):1–39, 2022

William Fedus, Barret Zoph, and Noam Shazeer. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity.Journal of Machine Learning Research, 23(120):1–39, 2022

2022

[15] [15]

Holgado-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga

Oresti Banos, Rafael García, Juan A. Holgado-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga. mhealthdroid: A novel framework for agile development of mobile health applications. InInternational Work-Conference on Ambient Assisted Living, pages 91–98. Springer, 2014

2014

[16] [16]

Chronos: Learning the Language of Time Series

Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, and Yuyang Wang. Chronos: Learning the language ...

work page internal anchor Pith review Pith/arXiv arXiv 2024

[17] [17]

Wearable sensor-based human activity recognition with transformer model.Sensors, 22(5):1911, 2022

Iveta Dirgová Luptáková, Martin Kubovˇcík, and Jiˇrí Pospíchal. Wearable sensor-based human activity recognition with transformer model.Sensors, 22(5):1911, 2022. 17

1911