Activity2Vec: Learning ADL Embeddings from Sensor Data with a Sequence-to-Sequence Model
Pith reviewed 2026-05-24 22:46 UTC · model grok-4.3
The pith
A sequence-to-sequence model learns universal embeddings from sensor data for activities of daily living.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a sequence-to-sequence architecture can be trained on unlabeled or partially labeled sensor sequences to produce embeddings that serve as drop-in features for activity-of-daily-living recognition and for fall detection, thereby eliminating hand-crafted feature engineering while also enabling semi-supervised learning on the same data sources.
What carries the argument
A sequence-to-sequence model that encodes variable-length sensor time series into fixed-length embeddings, which are then used as input features for downstream classification.
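To make the variable-length-in, fixed-length-out contract concrete, here is a minimal sketch of the encoder half only. It is not the paper's architecture: the weights are random and untrained, and the names `make_encoder`, `input_dim`, and `hidden_dim`, along with the three-axis sample values, are assumptions chosen for illustration.

```python
import math
import random


def make_encoder(input_dim, hidden_dim, seed=0):
    """Build a tiny tanh RNN encoder with fixed random weights (untrained, illustrative)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
            for _ in range(hidden_dim)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
             for _ in range(hidden_dim)]

    def encode(sequence):
        # Start from a zero state and fold every time step into it.
        h = [0.0] * hidden_dim
        for x in sequence:
            h = [
                math.tanh(
                    sum(w_in[i][j] * x[j] for j in range(input_dim))
                    + sum(w_rec[i][k] * h[k] for k in range(hidden_dim))
                )
                for i in range(hidden_dim)
            ]
        # The final hidden state is the fixed-length embedding.
        return h

    return encode


# Hypothetical 3-axis accelerometer windows of different lengths.
encode = make_encoder(input_dim=3, hidden_dim=8)
short_window = [[0.1, 0.0, 9.8]] * 5
long_window = [[0.3, -0.2, 9.6]] * 40

emb_short = encode(short_window)
emb_long = encode(long_window)
print(len(emb_short), len(emb_long))
```

In a full seq2seq setup a decoder would be trained to reconstruct or predict the sequence from this state, and that training is what would make the embedding informative. The sketch only shows why a downstream classifier can accept windows of any length.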
If this is right
- Activity recognition systems no longer require separate feature-engineering pipelines for each new sensor type or environment.
- Partially labeled sensor collections can be exploited for training by treating the learned embeddings as input to a downstream supervised head.
- The same embedding space can be reused for fall detection without redesigning input representations.
Where Pith is reading between the lines
- The embeddings could serve as a starting representation for other time-series health tasks such as sleep-stage detection or gait analysis if the universality assumption holds.
- Deployment in real homes would require checking whether the learned features remain stable when sensor placement or sampling rates change slightly from the training distribution.
Load-bearing premise
The embeddings produced by the sequence-to-sequence model remain meaningful and useful when transferred to new but similar sensor-based recognition tasks without retraining or task-specific adjustments.
What would settle it
Train the model on one sensor dataset, freeze the embeddings, then measure whether a simple classifier using those embeddings matches or exceeds the accuracy of models built with hand-engineered features on a second, independent activity-recognition or fall-detection dataset collected from different users or sensor placements.
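The protocol above can be sketched as a small harness. Everything in it is an assumption made for illustration: the data are synthetic, `frozen_encoder` is an untrained stand-in for an encoder pretrained on a first dataset, `hand_features` is a deliberately simple baseline, and the nearest-centroid classifier is just one possible "simple classifier". The printed accuracies therefore say nothing about the paper's method.

```python
import math
import random
import statistics

rng = random.Random(0)


def synthetic_window(label, length):
    """Hypothetical 1-D sensor window; class 1 oscillates more strongly than class 0."""
    amp = 0.2 if label == 0 else 1.0
    return [amp * math.sin(0.5 * t) + rng.gauss(0, 0.1) for t in range(length)]


def make_dataset(n):
    """Return n (window, label) pairs with alternating labels and varying lengths."""
    return [(synthetic_window(i % 2, rng.randint(20, 40)), i % 2) for i in range(n)]


def frozen_encoder(window):
    """Stand-in for a pretrained encoder: the weights are constants and never updated."""
    w_in = [0.9, -0.7, 0.5, -0.3]
    w_rec = [0.5, 0.4, 0.3, 0.2]
    h = [0.0] * 4
    for x in window:
        h = [math.tanh(w_in[i] * x + w_rec[i] * h[i]) for i in range(4)]
    return h


def hand_features(window):
    """Hand-engineered baseline: mean and standard deviation of the window."""
    return [statistics.fmean(window), statistics.pstdev(window)]


def fit_centroids(features, labels):
    """Compute one mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, y in zip(features, labels) if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def accuracy(featurize, train, test):
    """Fit a nearest-centroid classifier on train, then score it on test."""
    centroids = fit_centroids([featurize(w) for w, _ in train],
                              [y for _, y in train])
    hits = sum(
        min(centroids, key=lambda c: math.dist(centroids[c], featurize(w))) == y
        for w, y in test
    )
    return hits / len(test)


# Dataset B: the second, independent dataset. The encoder is not retrained on it.
dataset_b = make_dataset(60)
train_b, test_b = dataset_b[:40], dataset_b[40:]

acc_embedding = accuracy(frozen_encoder, train_b, test_b)
acc_handcrafted = accuracy(hand_features, train_b, test_b)
print(f"frozen embeddings: {acc_embedding:.2f}  hand-crafted: {acc_handcrafted:.2f}")
```

The point of the design is that only the featurizer changes between the two runs; the classifier, the split, and the data stay the same, so any accuracy gap can be attributed to the representation.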
Original abstract
Recognizing activities of daily living (ADLs) plays an essential role in analyzing human health and behavior. The widespread availability of sensors embedded in homes, smartphones, and smart watches has engendered the collection of big datasets that reflect human behavior. To obtain a machine learning model based on these data, researchers have developed multiple feature extraction methods. In this study, we investigate a method for automatically extracting universal and meaningful features that are applicable across similar time series-based learning tasks such as activity recognition and fall detection. We propose creating a sequence-to-sequence (seq2seq) model to perform this feature learning. Besides avoiding feature engineering, the meaningful features learned by the seq2seq model can also be utilized for semi-supervised learning. We evaluate both of these benefits on datasets collected from wearable and ambient sensors.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Activity2Vec, a sequence-to-sequence model for automatically learning embeddings from wearable and ambient sensor time series data representing activities of daily living. The central claims are that the learned features avoid manual engineering, are universal across related tasks such as activity recognition and fall detection, and can support semi-supervised learning; these benefits are stated to have been evaluated on the relevant datasets.
Significance. If the empirical results demonstrate that the seq2seq embeddings match or exceed hand-engineered baselines on multiple tasks while enabling effective semi-supervised performance, the work would offer a practical data-driven alternative to feature engineering in sensor-based ADL analysis.
minor comments (2)
- The abstract states that evaluation was performed on wearable and ambient sensor datasets but supplies no quantitative metrics, dataset sizes, or baseline comparisons; the full manuscript should include these in the results section to allow assessment of the universality claim.
- Notation for the seq2seq architecture (encoder/decoder dimensions, loss function) is not described in the provided abstract; the methods section should define these explicitly with reference to standard seq2seq components.
Simulated Author's Rebuttal
We thank the referee for reviewing our manuscript. The provided summary accurately reflects the proposed Activity2Vec approach and its claimed benefits for ADL embedding learning, universality across tasks, and semi-supervised utility. No specific major comments appear after the 'MAJOR COMMENTS:' heading, which prevents point-by-point addressing of concerns underlying the uncertain recommendation.
- The uncertain recommendation cannot be directly addressed without the specific major comments that motivated it.
Circularity Check
No significant circularity
The paper proposes a seq2seq model for automatic feature extraction from sensor time series data for ADL recognition and related tasks, with empirical evaluation on wearable and ambient sensor datasets. No equations, derivations, fitted parameters presented as predictions, or self-citation chains appear in the provided abstract or description of the argument. The central claim is a methodological proposal whose benefits are assessed empirically rather than derived by construction from inputs. The universality claim is framed as an outcome of evaluation, not an untested premise or definitional reduction. This is self-contained against external benchmarks with no load-bearing circular steps.