Feature Anchors for Time-Series Sensor-Based Human Activity Recognition
Pith reviewed 2026-05-07 15:56 UTC · model grok-4.3
The pith
Handcrafted time-series features improve wearable human activity recognition when kept explicit and modulated inside the model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that treating handcrafted time-series features as feature anchors—explicit intermediate representations that are adjusted in feature space by context-conditioned scale, bias, and gating parameters—produces representations that are both semantically transparent and task-adaptive, yielding higher macro-F1 scores on USC-HAD, Daphnet, MHealth, and PAMAP2 than baselines that either fix the features or rely solely on latent learning.
What carries the argument
The Temporal Conditioning Network (TCNet), which extracts handcrafted TSF anchors and modulates them via predicted scale, bias, and gating values derived from separate time-domain and frequency-domain context encoders applied to raw IMU windows.
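The review does not spell out the modulation formula. Assuming a FiLM-style affine update with sigmoid gating, which matches the "scale, bias, and gating" description, the per-anchor operation might look like the following sketch (function and variable names are hypothetical, not taken from the TCNet code):

```python
import numpy as np

def modulate_anchors(anchors, scale, bias, gate):
    """FiLM-style modulation of explicit feature anchors.

    anchors: (batch, n_features) handcrafted TSF values.
    scale, bias, gate: (batch, n_features) context-predicted parameters;
    in TCNet these would come from the time/frequency context encoders,
    but here they are plain arrays so the sketch stays self-contained.
    The gate is squashed to (0, 1) so it can softly switch anchors off.
    """
    g = 1.0 / (1.0 + np.exp(-gate))       # sigmoid gating
    return g * (scale * anchors + bias)   # each output dim remains a known TSF

# Toy example: two windows, three anchors (e.g., mean, std, dominant freq).
anchors = np.array([[0.1, 0.9, 2.0],
                    [0.2, 0.5, 4.0]])
scale = np.ones_like(anchors)
bias = np.zeros_like(anchors)
gate = np.full_like(anchors, 10.0)        # sigmoid(10) ~ 1: near-identity
out = modulate_anchors(anchors, scale, bias, gate)
```

Because the modulation is element-wise, each output dimension stays traceable to the handcrafted statistic it started as, which is what makes the anchors "explicit" in the paper's sense.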
If this is right
- Handcrafted TSFs retain discriminative value when kept explicit and modulated rather than treated as fixed preprocessing outputs.
- Gains on the five benchmarks are attributable to anchor guidance and not merely to the addition of a parallel branch.
- Several families of discriminative time-series statistics remain inaccessible to standard latent representations learned directly from raw signals.
- Keeping anchors visible allows the model to adapt them to the classification objective without post-hoc feature selection.
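The third bullet, that some TSF families are inaccessible to latent representations, is testable with a linear probe: regress each handcrafted feature on the latent embedding and compare R² values. A minimal sketch on synthetic data (the names and the synthetic targets are illustrative; the paper's actual feature-space analysis may use a different protocol):

```python
import numpy as np

def probe_r2(latent, tsf, ridge=1e-3):
    """How well can a linear probe recover a TSF from latent features?

    Fits ridge regression latent -> tsf and returns R^2 on the same data
    (higher = the feature is linearly accessible in the latent space).
    """
    X = np.column_stack([latent, np.ones(len(latent))])  # add bias column
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ tsf)
    pred = X @ w
    ss_res = np.sum((tsf - pred) ** 2)
    ss_tot = np.sum((tsf - tsf.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))                # stand-in latent embedding
accessible = latent @ rng.normal(size=8)          # linear in the latent space
inaccessible = np.sin(3 * latent).prod(axis=1)    # not linearly decodable
r2_a = probe_r2(latent, accessible)
r2_b = probe_r2(latent, inaccessible)
```

A large gap between the two R² values is the kind of evidence the abstract's "feature-space analyses" would need to show for real TSF families against real latent encoders.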
Where Pith is reading between the lines
- The same anchoring principle could be tested on other sensor-based time-series tasks such as gesture recognition or equipment monitoring where statistical features are known to be informative.
- Models built this way may offer improved post-hoc interpretability because the modulated anchors remain traceable to known motion statistics.
- Replacing the handcrafted anchors with learned but still explicitly grouped features might preserve some of the gains while reducing reliance on domain-specific feature engineering.
Load-bearing premise
The observed accuracy gains arise primarily from the explicit anchor modulation mechanism rather than from architectural side effects such as branch fusion or from dataset-specific properties of the chosen handcrafted features.
What would settle it
A controlled experiment on the same benchmarks would settle it: disable anchor modulation by replacing the predicted scale/bias/gating values with identity or fixed constants, while keeping every other network component unchanged. If mF1 shows no statistically significant drop, the claim that modulation of explicit anchors is the key driver is falsified.
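One way to judge "statistically significant drop" with the small number of runs typical in HAR papers is a paired sign-flip permutation test on per-seed mF1 differences. A sketch with made-up scores (the numbers below are illustrative, not results from the paper):

```python
import numpy as np

def paired_permutation_pvalue(full, ablated, n_perm=10_000, seed=0):
    """Paired permutation test on per-seed mF1 differences.

    full, ablated: mF1 from matched training runs (same seeds/splits),
    with and without anchor modulation. Returns a two-sided p-value for
    the null hypothesis that modulation makes no difference, estimated
    by randomly flipping the sign of each paired difference.
    """
    rng = np.random.default_rng(seed)
    diffs = np.asarray(full) - np.asarray(ablated)
    observed = abs(diffs.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diffs.size))
    perm_means = np.abs((signs * diffs).mean(axis=1))
    return (1 + np.sum(perm_means >= observed)) / (1 + n_perm)

# Illustrative (made-up) per-seed scores, not taken from the paper:
full    = [0.702, 0.698, 0.707, 0.700, 0.705]
ablated = [0.671, 0.665, 0.676, 0.668, 0.673]
p = paired_permutation_pvalue(full, ablated)
```

Note that with only five seeds the smallest attainable two-sided p-value is 2/32, so a convincing falsification (or confirmation) would need more runs per configuration.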
Original abstract
Wearable Human Activity Recognition (HAR) still lacks a representation that is both explicit and adaptable. Handcrafted time-series features (TSFs) capture meaningful motion statistics and remain competitive on standard benchmarks, but they are usually used as fixed preprocessing outputs. Deep models learn adaptable representations directly from raw signals, but those representations are typically latent and difficult to inspect. We address this gap by treating handcrafted TSFs as feature anchors: explicit intermediate representations that remain inside the model and are adjusted by neural context instead of being discarded. We propose the Temporal Conditioning Network for Feature Anchors (TCNet), which extracts handcrafted anchors, encodes complementary time-domain and frequency-domain context from raw IMU windows, and predicts context-conditioned scale, bias, and gating parameters to modulate anchor groups directly in feature space. This design keeps anchor semantics visible while allowing the representation to adapt to the classification objective. Across five HAR benchmarks, TCNet achieves 70.2% mF1 on USC-HAD, 85.1% mF1 on Daphnet, 93.9% mF1 on MHealth, and 94.5% mF1 on PAMAP2. Relative to rTsfNet, it improves by 4.5 points on USC-HAD, 14.6 points on Daphnet, and 6.5 points on MHealth. Ablations show that the gains come primarily from anchor guidance rather than simple branch fusion, and feature-space analyses indicate that several discriminative TSF families are not reliably accessible in standard latent representations. These results suggest that, for HAR, handcrafted TSFs are most useful when they remain explicit and adaptable within the model. The code is available at: https://github.com/ni-x-lab/TCNet-har
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes the Temporal Conditioning Network (TCNet) for wearable sensor-based Human Activity Recognition (HAR). It treats handcrafted time-series features (TSFs) as explicit 'feature anchors' that are kept inside the model and modulated by context-dependent scale, bias, and gating parameters predicted from complementary time-domain and frequency-domain encoders applied to raw IMU windows. Across five benchmarks, TCNet reports mF1 scores of 70.2% (USC-HAD), 85.1% (Daphnet), 93.9% (MHealth), and 94.5% (PAMAP2), with gains over rTsfNet (e.g., +4.5 on USC-HAD) attributed primarily to anchor guidance rather than branch fusion; feature-space analyses suggest certain discriminative TSF families are inaccessible in standard latent representations. The authors conclude that handcrafted TSFs are most useful when kept explicit and adaptable, and release code at https://github.com/ni-x-lab/TCNet-har.
Significance. If the central claim and ablations hold, the work offers a practical hybrid representation for HAR that preserves the interpretability and domain knowledge of handcrafted TSFs while adding neural adaptability, potentially influencing designs that currently favor fully latent deep features. The multi-benchmark evaluation and public code repository are clear strengths that enable reproducibility and extension. The suggestion that explicit anchors can access feature families missed by standard latent spaces, if substantiated, would be a useful empirical observation for the field.
major comments (1)
- [Ablation experiments] Ablations (as summarized in the abstract): The central claim that improvements derive primarily from anchor guidance rather than simple branch fusion rests on the ablation results. However, if the 'simple branch fusion' baseline does not include equivalent time/frequency context encoders or the same parameter budget for predicting scale/bias/gating, the comparison does not cleanly isolate the benefit of keeping anchors explicit. This is load-bearing for the paper's main conclusion and requires a more tightly controlled ablation.
minor comments (2)
- [Experimental evaluation] The manuscript does not report error bars, standard deviations across runs, or statistical significance tests for the mF1 improvements, nor does it detail data splits, preprocessing, or hyperparameter selection procedures.
- [Analysis section] The feature-space analyses that indicate certain TSF families are inaccessible in latent representations would benefit from additional methodological detail (e.g., exact distance metrics, selection criteria for TSF families, and quantitative thresholds).
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address the concern regarding the ablation experiments below and have revised the manuscript to provide a more tightly controlled comparison.
Point-by-point responses
Referee: [Ablation experiments] Ablations (as summarized in the abstract): The central claim that improvements derive primarily from anchor guidance rather than simple branch fusion rests on the ablation results. However, if the 'simple branch fusion' baseline does not include equivalent time/frequency context encoders or the same parameter budget for predicting scale/bias/gating, the comparison does not cleanly isolate the benefit of keeping anchors explicit. This is load-bearing for the paper's main conclusion and requires a more tightly controlled ablation.
Authors: We appreciate the referee's observation that the ablation must cleanly isolate the contribution of explicit anchor guidance. In the submitted manuscript, the 'simple branch fusion' baseline uses the same time- and frequency-domain encoders as TCNet but fuses their outputs via concatenation with the anchors, without the context-dependent modulation. To address the concern about parameter budget, we have performed an additional controlled ablation in which a comparable number of parameters are used to predict scale, bias, and gating terms that are instead applied to a standard latent representation (i.e., without explicit anchors). The results of this experiment, which we will report in the revised manuscript, continue to show superior performance for the anchor-based modulation, thereby supporting our central claim. We have also expanded the description of all baselines in Section 4.3 to clarify the architectural equivalence.
Revision: yes
Circularity Check
No circularity: empirical architecture with benchmark results
Full rationale
The paper proposes TCNet as a neural architecture that keeps handcrafted time-series features (TSFs) explicit as anchors and modulates them via predicted scale/bias/gating from time/frequency context encoders. All central claims rest on empirical mF1 scores across five standard HAR benchmarks (USC-HAD, Daphnet, MHealth, PAMAP2, etc.) plus ablations that compare against rTsfNet and branch-fusion baselines. No equations, first-principles derivation, or uniqueness theorem is presented that reduces by construction to fitted parameters, self-citations, or renamed inputs. The work is therefore self-contained; reported gains are tested against external datasets and architectural controls rather than being forced by internal definitions.
Axiom & Free-Parameter Ledger
free parameters (1)
- context encoder architecture and training hyperparameters
axioms (1)
- Domain assumption: handcrafted time-series features capture meaningful and discriminative motion statistics for HAR
invented entities (1)
- Feature anchors (no independent evidence)
Reference graph
Works this paper leans on
- [1] Alireza Abedin, Mahsa Ehsanpour, Qinfeng Shi, Hamid Rezatofighi, and Damith C. Ranasinghe. 2021. Attend and Discriminate: Beyond the State-of-the-Art for Human Activity Recognition Using Wearable Sensors. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 5, 1 (2021), 1:1–1:22. doi:10.1145/3448083
- [2] Rida Amin, Eoin Keogh, et al. 2024. Exploring the Applications of Explainability in Wearable Data Analytics: Systematic Literature Review. Journal of Medical Internet Research 12, 1 (2024)
- [3] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, Jorge Luis Reyes-Ortiz, et al. 2013. A public domain dataset for human activity recognition using smartphones. In ESANN, Vol. 3. 3–4
- [4] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. 2016. Layer Normalization. arXiv preprint arXiv:1607.06450 (2016)
- [5] Marc Bachlin, Meir Plotnik, Daniel Roggen, Inbal Maidan, Jeffrey M. Hausdorff, Nir Giladi, and Gerhard Troster. 2009. Wearable assistant for Parkinson's disease patients with the freezing of gait symptom. IEEE Transactions on Information Technology in Biomedicine 14, 2 (2009), 436–446
- [6] Marc Bächlin, Meir Plotnik, Daniel Roggen, Inbal Maidan, Jeffrey M. Hausdorff, Nir Giladi, and Gerhard Tröster. 2010. Wearable Assistant for Parkinson's Disease Patients With the Freezing of Gait Symptom. IEEE Transactions on Information Technology in Biomedicine 14, 2 (2010), 436–446
- [7] Oresti Banos, Rafael Garcia, Juan A. Holgado-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga. 2014. mHealthDroid: a novel framework for agile development of mobile health applications. In International Workshop on Ambient Assisted Living. Springer, 91–98
- [8] Marius Bock, Michael Moeller, and Kristof Van Laerhoven. 2024. Temporal action localization for inertial-based human activity recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (2024), 1–19
- [9] Andreas Bulling, Ulf Blanke, and Bernt Schiele. 2014. A Tutorial on Human Activity Recognition Using Body-worn Inertial Sensors. Comput. Surveys 46, 3 (2014), 1–33
- [10] Maximilian Christ, Nils Braun, Julius Neuffer, and Andreas W. Kempa-Liehr. 2018. Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307 (2018), 72–77
- [11] Nidhi Dua, Shiva Nand Singh, Vijay Bhaskar Semwal, and Sravan Kumar Challa. 2023. Inception inspired CNN-GRU hybrid network for human activity recognition. Multimedia Tools and Applications 82, 4 (2023), 5369–5403
- [12] Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. 2017. A Learned Representation For Artistic Style. In Proceedings of the International Conference on Learning Representations
- [13] Sannara Ek, François Portet, and Philippe Lalanda. 2023. Transformer-based models to deal with heterogeneous environments in human activity recognition. Personal and Ubiquitous Computing 27, 6 (2023), 2267–2280
- [14] Ziqi Gao, Yuntao Wang, Jianguo Chen, Junliang Xing, Shwetak Patel, Xin Liu, and Yuanchun Shi. 2023. MMTSA: Multi-modal temporal segment attention network for efficient human activity recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 3 (2023), 1–26
- [15] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 580–587
- [16] Yu Guan and Thomas Plötz. 2017. Ensembles of Deep LSTM Learners for Activity Recognition Using Wearables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, 2 (2017), 11:1–11:28. doi:10.1145/3090076
- [17] Nils Y. Hammerla, Shane Halloran, and Thomas Plötz. 2016. Deep, Convolutional, and Recurrent Models for Human Activity Recognition Using Wearables. In Proceedings of the International Joint Conference on Artificial Intelligence. 1533–1540
- [18] Harish Haresamudram, Chi Ian Tang, Sungho Suh, Paul Lukowicz, and Thomas Plötz. 2025. Past, Present, and Future of Sensor-based Human Activity Recognition Using Wearables: A Surveying Tutorial on a Still Challenging Task. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 2 (2025), 34:1–34:44. doi:10.1145/3729467
- [19] Raul Igual, Carlos Medrano, and Inmaculada Plaza. 2013. Challenges, Issues and Trends in Fall Detection Systems. BioMedical Engineering OnLine 12, 1 (2013), 66
- [20] Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the International Conference on Machine Learning. 448–456
- [21] Nikita Kitaev, Łukasz Kaiser, and Anselm Levskaya. 2020. Reformer: The Efficient Transformer. In Proceedings of the International Conference on Learning Representations
- [22] Oscar D. Lara and Miguel A. Labrador. 2013. A Survey on Human Activity Recognition Using Wearable Sensors. IEEE Communications Surveys & Tutorials 15, 3 (2013), 1192–1209
- [23] Shizhan Liu, Hang Yu, Cong Liao, Jianguo Li, Weiyao Lin, Alex X. Liu, and Schahram Dustdar. 2022. Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting. In Proceedings of the International Conference on Learning Representations
- [24] Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, and Mingsheng Long. 2024. iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. In Proceedings of the International Conference on Learning Representations
- [25] Limeng Lu, Chuanlin Zhang, Kai Cao, Tao Deng, and Qianqian Yang. 2022. A multichannel CNN-GRU model for human activity recognition. IEEE Access 10 (2022), 66797–66810
- [26] Wenjun Ma, Haoran Jing, Zhiwen Yu, and Bin Guo. 2019. AttnSense: Multi-level Attention Mechanism For Multimodal Human Activity Recognition. In Proceedings of the International Joint Conference on Artificial Intelligence. 3109–3115
- [27] Shenghuan Miao, Ling Chen, and Rong Hu. 2023. Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 4 (2023), 172:1–172:25. doi:10.1145/3631415
- [28] Shenghuan Miao, Ling Chen, Rong Hu, and Yingsong Luo. 2022. Towards a Dynamic Inter-Sensor Correlations Learning Framework for Multi-Sensor-Based Wearable Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 3 (2022), 130:1–130:25. doi:10.1145/3550331
- [29] Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. 2023. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In Proceedings of the International Conference on Learning Representations
- [30] Francisco Javier Ordóñez and Daniel Roggen. 2016. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors 16, 1 (2016), 115
- [31] Lingfeng Peng, Luyu Chen, Zhiwen Ye, and Yi Zhang. 2018. AROMA: A Deep Multi-Task Learning Based Simple and Complex Human Activity Recognition Method Using Wearable Sensors. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2, 2 (2018), 74:1–74:16. doi:10.1145/3214277
- [32] Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron Courville. 2018. FiLM: Visual Reasoning with a General Conditioning Layer. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32
- [33] Thomas Plötz, Nils Y. Hammerla, and Patrick Olivier. 2011. Feature Learning for Activity Recognition in Ubiquitous Computing. In Proceedings of the International Joint Conference on Artificial Intelligence. 1729–1734
- [34] Stephen J. Preece, John Yannis Goulermas, Laurence P. J. Kenney, and David Howard. 2009. A Comparison of Feature Extraction Methods for the Classification of Dynamic Activities From Accelerometer Data. IEEE Transactions on Biomedical Engineering 56, 3 (2009), 871–879
- [35] Attila Reiss and Didier Stricker. 2012. Introducing a new benchmarked dataset for activity monitoring. In 2012 16th International Symposium on Wearable Computers. IEEE, 108–109
- [36] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems, Vol. 28
- [37] Rui Shao, Hao Wang, and Shuochao Yao. 2023. ConvBoost: Boosting ConvNets for Sensor-based Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 1 (2023), 33:1–33:26. doi:10.1145/3580897
- [38] Muhammad Shoaib, Stephan Bosch, Ozlem Durmaz Incel, Hans Scholten, and Paul J. M. Havinga. 2015. A Survey of Online Activity Recognition Using Mobile Phones. Sensors 15, 1 (2015), 2059–2085
- [39] Jie Su, Fengtong Ge, Zhenyu Wen, Taotao Li, Yang Bai, Yejian Zhou, and Xiaoqin Zhang. 2025. IMUZero: Zero-Shot Human Activity Recognition by Language-Based Cross Modality Fusion. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 4 (2025), 211:1–211:28
- [40] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All You Need. In Advances in Neural Information Processing Systems, Vol. 30
- [41] Huiqiang Wang, Jian Peng, Feihu Huang, Jince Wang, Junhui Chen, and Yifei Xiao. 2023. MICN: Multi-scale Local and Global Context Modeling for Long-term Series Forecasting. In Proceedings of the International Conference on Learning Representations
- [42] Jindong Wang, Yiqiang Chen, Shuji Hao, Xiaohui Peng, and Lisha Hu. 2019. Deep Learning for Sensor-based Activity Recognition: A Survey. Pattern Recognition Letters 119 (2019), 3–11
- [43] Sheng Wen and Eno Lab. 2024. rTsfNet: A DNN Model with Multi-head 3D Rotation and Time Series Feature Extraction for IMU-based Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (2024)
- [44] Haixu Wu, Tengge Hu, Yong Liu, Hang Zhou, Jianmin Wang, and Mingsheng Long. 2023. TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis. In Proceedings of the International Conference on Learning Representations
- [45] Zechen Yang et al. 2025. SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
- [46] Shuochao Yao, Shaohan Hu, Yiran Zhao, Aston Zhang, and Tarek Abdelzaher. 2017. DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing. In Proceedings of the 26th International Conference on World Wide Web. 351–360
- [47] Hang Yuan, Shing Chan, Andrew P. Creagh, Catherine Tong, David A. Clifton, and Aiden Doherty. 2024. Self-supervised Learning for Human Activity Recognition Using 700,000 Person-days of Wearable Data. npj Digital Medicine 7 (2024), 91
- [48] Ailing Zeng, Muxi Chen, Lei Zhang, and Qiang Xu. 2023. Are Transformers Effective for Time Series Forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37
- [49] Mi Zhang and Alexander A. Sawchuk. 2012. USC-HAD: A daily activity dataset for ubiquitous activity recognition using wearable sensors. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing. 1036–1043
- [50]
- [51] Ye Zhang, Longguang Wang, Huiling Chen, Aosheng Tian, Shilin Zhou, and Yulan Guo. 2022. IF-ConvTransformer: A Framework for Human Activity Recognition Using IMU Fusion and ConvTransformer. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 2 (2022), 88:1–88:26. doi:10.1145/3534584
- [52] Yunhao Zhang and Junchi Yan. 2023. Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting. In Proceedings of the International Conference on Learning Representations
- [53] Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang. 2021. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35
- [54] Tian Zhou, Ziqing Ma, Qingsong Wen, Liang Sun, Terrance Yardley, Xue Wang, and Rong Jin. 2022. FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting. In Advances in Neural Information Processing Systems, Vol. 35
- [55] Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, and Rong Jin. 2022. FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting. In Proceedings of the International Conference on Machine Learning
discussion (0)