
arxiv: 2605.22086 · v1 · pith:2FRCHQQXnew · submitted 2026-05-21 · 💻 cs.CV

GenHAR: Generalizing Cross-domain Human Activity Recognition for Last-mile Delivery

Pith reviewed 2026-05-22 07:38 UTC · model grok-4.3

classification 💻 cs.CV
keywords Human Activity Recognition · Cross-domain Generalization · Sensor Data Tokenization · Domain-invariant Representations · Efficient Attention Mechanism · Last-mile Delivery · Distribution Shift

The pith

GenHAR learns domain-invariant representations from source sensor data alone by tokenizing readings and modeling frequency-channel correlations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

GenHAR addresses the distribution shift problem in human activity recognition by building representations that transfer to new sensor domains without any target-domain data or fine-tuning. The framework converts sensor streams into tokens and explicitly learns correlations across the frequency dimensions of the sensor channels, then applies selective masking plus an efficient attention mechanism to keep accuracy high and compute cost low. On real-world HAR datasets it improves accuracy by 9.97 percent over prior methods while cutting floating-point operations by a factor of 6.4. The same model was deployed at a logistics company in four cities, where it detected 2.15 billion real-time activities. These results indicate that source-only training can produce practical, generalizable HAR systems for settings like last-mile delivery.

Core claim

GenHAR mitigates the domain gap in cross-domain human activity recognition by learning domain-invariant sensor representations purely from source-domain data. Its central technical steps are tokenization of the sensor time series followed by explicit modeling of correlations among the frequency dimensions of the sensor channels, combined with selective masking and an efficient attention mechanism that together reduce computational cost while preserving transfer performance.
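To make the tokenize-then-correlate idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes one token per sensor channel, takes that channel's DFT amplitude spectrum as the token, and uses cosine similarity as a stand-in for whatever learned correlation the model computes. The function names and the toy window are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(signal):
    """Magnitude of the one-sided DFT of a real-valued sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def channel_tokens(window):
    """Assumption: one token per sensor channel, holding its amplitude spectrum."""
    return [amplitude_spectrum(channel) for channel in window]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def channel_correlation(tokens):
    """Pairwise cosine similarity between channel tokens (stand-in for learned attention)."""
    return [[cosine(u, v) for v in tokens] for u in tokens]


# Toy window (hypothetical data): 3 channels, 32 samples each.
n = 32
window = [
    [math.sin(2 * math.pi * 2 * t / n) for t in range(n)],        # 2 cycles
    [0.5 * math.sin(2 * math.pi * 2 * t / n) for t in range(n)],  # same frequency, half amplitude
    [math.sin(2 * math.pi * 5 * t / n) for t in range(n)],        # 5 cycles
]
tokens = channel_tokens(window)
corr = channel_correlation(tokens)
```

In this toy case the first two channels share a frequency and differ only in scale, so their tokens are almost perfectly aligned, while the third channel is close to orthogonal to both. A trained model would learn these relations rather than compute a fixed similarity.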

What carries the argument

Sensor-data tokenization with learned correlations among the frequency dimensions of the sensor channels, plus selective masking and efficient attention.

If this is right

  • HAR models can be trained once on a source domain and deployed directly to new environments without collecting or labeling target data.
  • The 6.4-fold reduction in floating-point operations enables real-time inference on resource-constrained devices used in logistics or wearable monitoring.
  • Systematic evaluation on multiple real-world HAR datasets shows consistent gains in both accuracy and efficiency over existing cross-domain methods.
  • Large-scale deployment at a logistics company across four cities processed 2.15 billion activity detections, confirming operational viability.
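
The efficiency bullet above can be illustrated with a rough operation count. The sketch assumes that "sensor-wise" attention (the wording of the Figure 6 caption) means attending over one token per channel instead of one token per time step, and it counts only the two matrix products inside a single attention head. The 6.4-fold figure is the paper's own measurement and does not follow from this toy count.

```python
def attention_flops(n_tokens, dim):
    """Approximate FLOPs of one attention head's two matrix products.

    QK^T and (attention weights) @ V each take n_tokens * n_tokens * dim
    multiply-adds, counted here as 2 FLOPs apiece. Projections, softmax,
    and feed-forward layers are ignored.
    """
    return 2 * 2 * n_tokens * n_tokens * dim


def speedup(seq_len, n_channels):
    """Ratio of temporal-attention cost to channel-wise attention cost."""
    temporal = attention_flops(n_tokens=seq_len, dim=n_channels)     # one token per time step
    sensor_wise = attention_flops(n_tokens=n_channels, dim=seq_len)  # one token per channel
    return temporal / sensor_wise


# Hypothetical window lengths for a 6-channel IMU.
for seq_len in (60, 120, 240):
    print(seq_len, speedup(seq_len, n_channels=6))
```

Under these assumptions the ratio simplifies to `seq_len / n_channels`, so the saving grows with window length, which is at least consistent in direction with the Figure 8 caption.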

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same source-only training strategy could be tested on other time-series sensing tasks such as predictive maintenance or environmental monitoring where labeled target data are expensive to obtain.
  • Because the method operates on tokenized frequency-channel correlations, it may combine naturally with existing self-supervised pre-training pipelines for sensor data.
  • The efficiency improvements suggest the approach could be further adapted for on-device continual learning without cloud retraining.

Load-bearing premise

That tokenizing sensor data and learning correlations among the frequency dimensions of the sensor channels will produce representations that remain invariant and effective across unseen target domains without any target data or fine-tuning.
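
One narrow piece of this premise can be checked directly: the DFT amplitude of a window does not change under a circular time shift, whereas the raw samples do. The sketch below demonstrates only that property on a synthetic signal (hypothetical data). It says nothing about device, placement, or physiology shifts, which can alter the spectrum itself.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


n = 16
signal = [
    math.sin(2 * math.pi * 3 * t / n) + 0.3 * math.cos(2 * math.pi * 5 * t / n)
    for t in range(n)
]
shifted = signal[4:] + signal[:4]  # circular shift by 4 samples

amp = [abs(c) for c in dft(signal)]
amp_shifted = [abs(c) for c in dft(shifted)]

# Largest sample-wise change in the time domain vs. in the amplitude spectrum.
raw_gap = max(abs(a - b) for a, b in zip(signal, shifted))
amp_gap = max(abs(a - b) for a, b in zip(amp, amp_shifted))
```

Here `raw_gap` is large while `amp_gap` is at floating-point noise level, because a time shift only rotates the phase of each frequency bin.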

What would settle it

Apply the trained GenHAR model to a new sensor placement or hardware type that induces a large distribution shift and measure whether its accuracy remains at least 9 percent above prior methods without using any target samples.
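
A small helper shows how such a check could be scored. This page does not say whether the reported 9.97% is a gain in percentage points or a relative improvement, so the sketch computes both and, as an assumption, applies the threshold in percentage points. All accuracy values and baseline names are hypothetical.

```python
def accuracy_gains(model_acc, baseline_acc):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = model_acc - baseline_acc
    relative = 100.0 * absolute / baseline_acc
    return absolute, relative


def margin_holds(model_acc, baseline_accs, min_points=9.0):
    """True if the model beats the best baseline by at least min_points."""
    return model_acc - max(baseline_accs) >= min_points


# Hypothetical target-domain accuracies in percent, not taken from the paper.
baselines = {"baseline_a": 70.0, "baseline_b": 74.0}
model = 84.0

abs_gain, rel_gain = accuracy_gains(model, max(baselines.values()))
holds = margin_holds(model, baselines.values())
```

With these made-up numbers the same result is a 10-point gain but roughly a 13.5 percent relative gain, which is why the reading of the headline figure matters for the test.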

Figures

Figures reproduced from arXiv: 2605.22086 by Baoshen Guo, Desheng Zhang, Guang Yang, Haotian Wang, Tian He, Xiubin Fan, Zelong Li, Zhiqing Hong.

Figure 1: Illustration of performance degradation in on-device …
Figure 2: Frequency space has a smaller distribution shift than …
Figure 4: Amplitude features are more domain-invariant than …
Figure 6: Sensor-wise self-attention improves efficiency.
Figure 8: Acceleration ratio increases with the sensor data length.
Figure 9: Impact of frequency amplitude. (Compares "w/o Frequency" with "Ours"; y-axis: percentage (%); F1 and accuracy on UCI, Shoaib, Motion, and HHAR.)
Figure 11: Impact of sensor-wise attention. (Compares "w/o Channel Attention and Frequency" with "Ours"; y-axis: percentage (%); F1 and accuracy on UCI, Shoaib, Motion, and HHAR.)
Figure 13: Attention map comparison.
Original abstract

Human Activity Recognition (HAR) has shown remarkable effectiveness in various applications, such as smart healthcare and intelligent manufacturing. However, a major challenge faced by HAR is the distribution shift across different sensor data domains, which often leads to decreased performance when deployed for real-world applications. To address this issue, this paper introduces GenHAR, a novel framework designed to mitigate the domain gap by learning domain-invariant sensor representations. GenHAR aims to enhance the generalization capabilities of HAR on target domains purely with data from the source domain. The key novelty of GenHAR lies in two aspects. Firstly, GenHAR tokenizes sensor data and learns correlations among frequency sensor channel dimensions to improve the robustness of HAR models. Secondly, GenHAR improves the efficiency via selective masking and an efficient attention mechanism. We conduct a systematic analysis of GenHAR by comparing it with state-of-the-art HAR methods on real-world human activity datasets. Results show that GenHAR outperforms state-of-the-art methods by 9.97% in accuracy, and reduces Floating Point Operations by 6.4 times. Moreover, we deploy GenHAR at a leading logistics company in 4 cities, and have detected 2.15 billion real-time activities. We release our code at: https://github.com/Sensor-FoundationModel/GenHAR.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces GenHAR, a framework for cross-domain Human Activity Recognition that learns domain-invariant sensor representations from source-domain data alone. Key components include tokenizing sensor data to capture correlations among the frequency dimensions of the sensor channels, plus selective masking and efficient attention for computational efficiency. It reports a 9.97% accuracy gain over state-of-the-art HAR methods and 6.4x fewer FLOPs, along with a real-world deployment at a logistics company across 4 cities that detected 2.15 billion activities; code is released publicly.

Significance. If the generalization results hold under rigorous evaluation, the work would be significant for practical HAR deployment in settings with domain shifts, such as last-mile delivery. The large-scale real-world deployment provides concrete evidence of utility beyond benchmarks, and the public code release is a clear strength that aids reproducibility.

major comments (2)
  1. [Abstract] Abstract: the reported 9.97% accuracy gain and 6.4x FLOPs reduction are presented without details on baseline implementations, statistical significance tests, cross-domain data splits, or quantitative measures of domain invariance, which are load-bearing for substantiating the central generalization claim.
  2. [Method] Method (implied by abstract description): the claim that tokenizing sensor data and learning frequency-channel correlations produces representations invariant to unseen target domains (without target samples or fine-tuning) lacks an explicit invariance regularizer, adversarial objective, or domain-simulation mechanism; standard HAR shifts (device, placement, physiology) alter frequency statistics, so the sufficiency of the described components needs demonstration.
minor comments (1)
  1. [Abstract] Abstract: consider adding a brief sentence on the number and characteristics of the real-world human activity datasets used in the systematic comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving clarity in the abstract and strengthening the justification for domain invariance in the method. We address each point below and indicate revisions where appropriate.

point-by-point responses
  1. Referee: [Abstract] Abstract: the reported 9.97% accuracy gain and 6.4x FLOPs reduction are presented without details on baseline implementations, statistical significance tests, cross-domain data splits, or quantitative measures of domain invariance, which are load-bearing for substantiating the central generalization claim.

    Authors: We agree that the abstract, being concise by nature, omits supporting details. In the revised manuscript we have expanded the abstract with a brief clause directing readers to Section 4 for the full experimental protocol. There we specify the exact SOTA baselines (DeepConvLSTM, AttendHAR, and others) with their re-implementations, the cross-domain evaluation using leave-one-city-out splits on the four-city logistics data, paired t-test results confirming statistical significance (p < 0.05) of the 9.97% accuracy improvement, and quantitative domain-invariance metrics (feature alignment scores) reported in the corresponding tables. These additions directly address the load-bearing elements of the generalization claim without altering the reported numbers. revision: yes

  2. Referee: [Method] Method (implied by abstract description): the claim that tokenizing sensor data and learning frequency-channel correlations produces representations invariant to unseen target domains (without target samples or fine-tuning) lacks an explicit invariance regularizer, adversarial objective, or domain-simulation mechanism; standard HAR shifts (device, placement, physiology) alter frequency statistics, so the sufficiency of the described components needs demonstration.

    Authors: The referee is correct that no explicit regularizer or adversarial term is present. Our design instead relies on the inductive bias introduced by frequency-channel tokenization, which groups correlated frequency components that remain relatively stable under common HAR domain shifts, combined with selective masking that forces the model to reconstruct from partial observations and thereby discourages overfitting to domain-specific frequency statistics. To demonstrate sufficiency we have added an expanded discussion in Section 3.2 together with supporting ablation results (new Table 5) that quantify the performance degradation when the frequency-correlation module is removed, and t-SNE visualizations (new Figure 4) showing improved source-target feature overlap on the real deployment data. While we did not incorporate an adversarial objective (to preserve training stability on resource-constrained sensor streams), the empirical gains across device, placement, and city-level shifts provide concrete evidence that the architectural choices are sufficient for the targeted last-mile delivery scenario. revision: partial
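
The evaluation protocol named in the first simulated response (leave-one-city-out splits plus a paired t-test) can be sketched as follows. The protocol comes from the simulated rebuttal and is not confirmed by the paper; the city names and accuracy values are placeholders. The standard library has no t-distribution CDF, so the sketch compares against the two-sided 5% critical value for 3 degrees of freedom instead of computing a p-value.

```python
import math
import statistics


def leave_one_domain_out(domains):
    """Yield (train_domains, held_out_domain) pairs, one per domain."""
    for held_out in domains:
        yield [d for d in domains if d != held_out], held_out


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the per-split differences scores_a - scores_b."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


domains = ["city_1", "city_2", "city_3", "city_4"]  # placeholder names
splits = list(leave_one_domain_out(domains))

# Hypothetical held-out accuracies, one per split; not results from the paper.
model_acc = [0.84, 0.81, 0.86, 0.83]
baseline_acc = [0.74, 0.72, 0.75, 0.73]

t_stat = paired_t_statistic(model_acc, baseline_acc)
T_CRIT_DF3 = 3.182  # two-sided 5% critical value of Student's t, 3 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF3
```

With only four held-out domains the test has three degrees of freedom, so a significant result needs a gain that is both large and consistent across splits.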

Circularity Check

0 steps flagged

No circularity: empirical gains rest on external benchmarks, not self-referential definitions or fitted predictions

full rationale

The paper presents GenHAR as a framework that tokenizes sensor data to learn frequency-channel correlations plus selective masking and efficient attention for domain-invariant representations. No equations, derivations, or first-principles results are shown that reduce the reported 9.97% accuracy improvement or 6.4x FLOPs reduction to quantities defined by the method's own fitted parameters or self-citations. Claims are validated via direct comparisons against external state-of-the-art HAR methods on real-world datasets and a logistics deployment, satisfying the criterion of being self-contained against external benchmarks. No load-bearing self-citation chains, ansatz smuggling, or renaming of known results appear in the abstract or described contributions.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on the domain assumption that sensor signals contain transferable frequency-channel correlations and that standard attention mechanisms can isolate domain-invariant features without target data. No new physical entities are postulated. Hyperparameters for masking and attention are implicit free parameters typical of neural training.

free parameters (1)
  • selective masking ratio and attention hyperparameters
    These control which tokens are masked and how attention is computed; they must be chosen or tuned and directly affect the reported efficiency and accuracy numbers.
axioms (1)
  • domain assumption: Sensor time-series data can be tokenized and processed analogously to discrete tokens in language or vision models to reveal domain-invariant correlations.
    Invoked when the paper states that GenHAR tokenizes sensor data and learns correlations among frequency sensor channel dimensions.
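
To show why the masking ratio listed in the ledger is a genuine free parameter, here is a sketch of how it trades retained information against attention cost. Uniform random selection is used purely as a stand-in: this page does not describe how GenHAR selects the tokens to mask. The function names and token labels are hypothetical.

```python
import random


def mask_tokens(tokens, mask_ratio, rng):
    """Keep a random subset of tokens; return (kept_tokens, kept_indices)."""
    n_keep = max(1, round(len(tokens) * (1.0 - mask_ratio)))
    kept = sorted(rng.sample(range(len(tokens)), n_keep))
    return [tokens[i] for i in kept], kept


def attention_pairs(n_tokens):
    """Number of query-key pairs scored by full self-attention."""
    return n_tokens * n_tokens


rng = random.Random(0)  # fixed seed so the sketch is repeatable
tokens = [f"tok{i}" for i in range(20)]
for ratio in (0.0, 0.5, 0.75):
    kept, _ = mask_tokens(tokens, ratio, rng)
    print(ratio, len(kept), attention_pairs(len(kept)))
```

Because attention cost scales with the square of the number of kept tokens, masking half of them cuts the scored pairs to a quarter, so the chosen ratio feeds directly into any reported FLOPs figure.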




Reference graph

Works this paper leans on

90 extracted references · 90 canonical work pages · 4 internal anchors

  1. [1]

    Alireza Abedin, Mahsa Ehsanpour, Qinfeng Shi, Hamid Rezatofighi, and Damith C Ranasinghe. 2021. Attend and discriminate: Beyond the state-of-the-art for human activity recognition using wearable sensors.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies5, 1 (2021), 1–22

  2. [2]

    Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton. 2016. Layer normaliza- tion.arXiv preprint arXiv:1607.06450(2016)

  3. [3]

    Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jennifer Wortman Vaughan. 2010. A theory of learning from different domains. Machine learning79 (2010), 151–175

  4. [4]

    Sejal Bhalla, Mayank Goel, and Rushil Khurana. 2022. IMU2Doppler: Cross-Modal Domain Adaptation for Doppler-Based Activity Recognition Using IMU Data. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.5, 4, Article 145 (dec 2022), 20 pages. https://doi.org/10.1145/3494994

  5. [5]

    E Oran Brigham and RE Morrow. 1967. The fast Fourier transform.IEEE spectrum 4, 12 (1967), 63–70

  6. [6]

    Yize Cai, Baoshen Guo, Flora Salim, and Zhiqing Hong. 2025. Towards Gen- eralizable Human Activity Recognition: A Survey. arXiv:2508.12213 [eess.SP] https://arxiv.org/abs/2508.12213

  7. [7]

    Youngjae Chang, Akhil Mathur, Anton Isopoussu, Junehwa Song, and Fahim Kawsar. 2020. A Systematic Study of Unsupervised Domain Adaptation for Robust Human-Activity Recognition.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.4, 1, Article 39 (mar 2020), 30 pages. https://doi.org/10.1145/3380985

  8. [8]

    Kaixuan Chen, Dalin Zhang, Lina Yao, Bin Guo, Zhiwen Yu, and Yunhao Liu

  9. [9]

    Surv.54, 4, Article 77 (may 2021), 40 pages

    Deep Learning for Sensor-Based Human Activity Recognition: Overview, Challenges, and Opportunities.ACM Comput. Surv.54, 4, Article 77 (may 2021), 40 pages. https://doi.org/10.1145/3447744

  10. [10]

    Ling Chen, Yi Zhang, and Liangying Peng. 2020. METIER: A Deep Multi-Task Learning Based Activity and User Recognition Model Using Wearable Sensors. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.4, 1, Article 5 (mar 2020), 18 pages. https://doi.org/10.1145/3381012

  11. [11]

    Si-An Chen, Chun-Liang Li, Nate Yoder, Sercan O Arik, and Tomas Pfister. 2023. Tsmixer: An all-mlp architecture for time series forecasting.arXiv preprint arXiv:2303.06053(2023)

  12. [12]

    Berken Utku Demirel and Christian Holz. 2023. Finding Order in Chaos: A Novel Data Augmentation Method for Time Series in Contrastive Learning. arXiv:2309.13439 [cs.LG]

  13. [13]

    Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee-Keong Kwoh, Xiaoli Li, and Cuntai Guan. 2023. Self-supervised contrastive representa- tion learning for semi-supervised time-series classification.IEEE Transactions on Pattern Analysis and Machine Intelligence(2023)

  14. [14]

    Ahmed Frikha, Haokun Chen, Denis Krompaß, Thomas Runkler, and Volker Tresp. 2023. Towards data-free domain generalization. InAsian Conference on Machine Learning. PMLR, 327–342

  15. [15]

    John Cristian Borges Gamboa. 2017. Deep learning for time-series analysis.arXiv preprint arXiv:1701.01887(2017)

  16. [16]

    Ziqi Gao, Yuntao Wang, Jianguo Chen, Junliang Xing, Shwetak Patel, Xin Liu, and Yuanchun Shi. 2023. MMTSA: Multi-Modal Temporal Segment Attention Network for Efficient Human Activity Recognition.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies7, 3 (2023), 1–26

  17. [17]

    Juan Haladjian. 2019. The wearables development toolkit: an integrated develop- ment environment for activity recognition applications.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies3, 4 (2019), 1–26

  18. [18]

    Harish Haresamudram, Irfan Essa, and Thomas Plötz. 2021. Contrastive Predictive Coding for Human Activity Recognition.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.5, 2, Article 65 (jun 2021), 26 pages. https://doi.org/10.1145/ 3463506

  19. [19]

    Harish Haresamudram, Irfan Essa, and Thomas Plötz. 2022. Assessing the state of self-supervised human activity recognition using wearables.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies6, 3 (2022), 1–47

  20. [20]

    Huan He, Owen Queen, Teddy Koker, Consuelo Cuevas, Theodoros Tsiligkaridis, and Marinka Zitnik. 2023. Domain Adaptation for Time Series Under Feature and Label Shifts.arXiv preprint arXiv:2302.03133(2023)

  21. [21]

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. InProceedings of the IEEE conference on computer vision and pattern recognition. 770–778

  22. [22]

    Zhiqing Hong, Zelong Li, Shuxin Zhong, Wenjun Lyu, Haotian Wang, Yi Ding, Tian He, and Desheng Zhang. 2024. CrossHAR: Generalizing Cross-dataset Human Activity Recognition via Hierarchical Self-Supervised Pretraining.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.8, 2, Article 64 (may 2024), 26 pages. https://doi.org/10.1145/3659597

  23. [23]

    Zhiqing Hong, Yiwei Song, Zelong Li, Anlan Yu, Shuxin Zhong, Yi Ding, Tian He, and Desheng Zhang. 2025. LLM4HAR: Generalizable On-device Human Activity Recognition with Pretrained LLMs. InProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2(Toronto ON, Canada) (KDD ’25). Association for Computing Machinery, New York, NY...

  24. [24]

    Zhiqing Hong, Guang Wang, Wenjun Lyu, Baoshen Guo, Yi Ding, Haotian Wang, Shuai Wang, Yunhuai Liu, and Desheng Zhang. 2022. CoMiner: na- tionwide behavior-driven unsupervised spatial coordinate mining from un- certain delivery events. InProceedings of the 30th International Conference on Advances in Geographic Information Systems (SIGSPATIAL ’22). Associa...

  25. [25]

    Zhiqing Hong, Weibing Wang, Anlan Yu, Shuxin Zhong, Haotian Wang, Yi Ding, Tian He, and Desheng Zhang. 2025. Experience Paper: Nationwide Human Behav- ior Sensing in Last-mile Delivery. InProceedings of the 31st Annual International Conference on Mobile Computing and Networking(Kerry Hotel, Hong Kong, Hong Kong, China)(ACM MOBICOM ’25). Association for Co...

  26. [26]

    Rong Hu, Ling Chen, Shenghuan Miao, and Xing Tang. 2023. Swl-adapt: An unsupervised domain adaptation model with sample weight learning for cross- user wearable human activity recognition. InProceedings of the AAAI Conference on artificial intelligence, Vol. 37. 6012–6020

  27. [27]

    Jizhou Huang, Haifeng Wang, Yibo Sun, Yunsheng Shi, Zhengjie Huang, An Zhuo, and Shikun Feng. 2022. ERNIE-GeoL: A Geography-and-Language Pre- trained Model and its Applications in Baidu Maps. InProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining(Washington DC, USA)(KDD ’22). Association for Computing Machinery, New York, N...

  28. [28]

    Sozo Inoue, Paula Lago, Tahera Hossain, Tittaya Mairittha, and Nattaya Mairittha

  29. [29]

    ACM Interact

    Integrating Activity Recognition and Nursing Care Records: The System, Deployment, and a Verification Study.Proc. ACM Interact. Mob. Wearable Ubiqui- tous Technol.3, 3, Article 86 (sep 2019), 24 pages. https://doi.org/10.1145/3351244

  30. [30]

    Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, and Pierre-Alain Muller. 2019. Deep learning for time series classification: a review.Data mining and knowledge discovery33, 4 (2019), 917–963

  31. [31]

    Yash Jain, Chi Ian Tang, Chulhong Min, Fahim Kawsar, and Akhil Mathur. 2022. Collossl: Collaborative self-supervised learning for human activity recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technolo- gies6, 1 (2022), 1–28

  32. [32]

    Jeya Vikranth Jeyakumar, Ankur Sarker, Luis Antonio Garcia, and Mani Srivas- tava. 2023. X-CHAR: A Concept-Based Explainable Complex Human Activity Recognition Model.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.7, 1, Article 17 (mar 2023), 28 pages. https://doi.org/10.1145/3580804

  33. [33]

    Jacob Devlin Ming-Wei Chang Kenton and Lee Kristina Toutanova. 2019. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of naacL-HLT, Vol. 1. 2

  34. [34]

    Donghyun Kim, Kaihong Wang, Stan Sclaroff, and Kate Saenko. 2022. A broad study of pre-training for domain generalization and adaptation. InEuropean Conference on Computer Vision. Springer, 621–638

  35. [35]

    Daehee Kim, Youngjun Yoo, Seunghyun Park, Jinkyu Kim, and Jaekoo Lee. 2021. Selfreg: Self-supervised contrastive regularization for domain generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 9619– 9628

  36. [36]

    Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, et al . 2023. Learning skillful medium-range global weather forecasting. Science(2023), eadi2336

  37. [37]

    Gupta, and Dezhi Hong

    Shuheng Li, Ranak Roy Chowdhury, Jingbo Shang, Rajesh K. Gupta, and Dezhi Hong. 2021. UniTS: Short-Time Fourier Inspired Neural Networks for Sen- sory Time Series Classification. InProceedings of the 19th ACM Conference on Embedded Networked Sensor Systems(Coimbra, Portugal)(SenSys ’21). As- sociation for Computing Machinery, New York, NY, USA, 234–247. h...

  38. [38]

    Shiqi Lin, Zhizheng Zhang, Zhipeng Huang, Yan Lu, Cuiling Lan, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Amey Parulkar, et al . 2023. Deep frequency filtering for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11797–11807

  39. [39]

    Minghao Liu, Shengqi Ren, Siyuan Ma, Jiahui Jiao, Yizhou Chen, Zhiguang Wang, and Wei Song. 2021. Gated transformer networks for multivariate time series classification.arXiv preprint arXiv:2103.14438(2021)

  40. [40]

    Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, and Mingsheng Long. 2023. iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. arXiv:2310.06625 [cs.LG] GenHAR: Generalizing Cross-domain Human Activity Recognition for Last-mile Delivery KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea

  41. [41]

    Zhen Liu, Qianli Ma, Peitian Ma, and Linghao Wang. 2023. Temporal-frequency co-training for time series semi-supervised learning. InProceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 8923–8931

  42. [42]

    Wang Lu, Jindong Wang, Yiqiang Chen, Sinno Jialin Pan, Chunyu Hu, and Xin Qin. 2022. Semantic-discriminative mixup for generalizable sensor-based cross- domain activity recognition.Proceedings of the ACM on Interactive, Mobile, Wear- able and Ubiquitous Technologies6, 2 (2022), 1–19

  43. [43]

    Wang Lu, Jindong Wang, Haoliang Li, Yiqiang Chen, and Xing Xie. 2022. Domain- invariant Feature Exploration for Domain Generalization.Transactions on Ma- chine Learning Research(2022)

  44. [44]

    Wenjun Lyu, Kexin Zhang, Baoshen Guo, Zhiqing Hong, Guang Yang, Guang Wang, Yu Yang, Yunhuai Liu, and Desheng Zhang. 2022. Towards Fair Work- load Assessment via Homogeneous Order Grouping in Last-mile Delivery. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management(Atlanta, GA, USA)(CIKM ’22). Association for Comput...

  45. [45]

    Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Lifeng Dong, Ruiping Wang, Jilong Xue, and Furu Wei. 2024. The era of 1- bit llms: All large language models are in 1.58 bits.arXiv preprint arXiv:2402.17764 1, 4 (2024)

  46. [46]

    Nattaya Mairittha, Tittaya Mairittha, Paula Lago, and Sozo Inoue. 2021. CrowdAct: Achieving High-Quality Crowdsourced Datasets in Mobile Activity Recognition. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.5, 1, Article 50 (mar 2021), 32 pages. https://doi.org/10.1145/3432222

  47. [47]

    Mohammad Malekzadeh, Richard G Clegg, Andrea Cavallaro, and Hamed Had- dadi. 2019. Mobile sensor data anonymization. InProceedings of the international conference on internet of things design and implementation. 49–58

  48. [48]

    Udomporn Manupibul, Ratikanlaya Tanthuwapathom, Wimonrat Jarumethi- tanont, Panya Kaimuk, Wat Limroongreungrat, and Warakorn Charoensuk

  49. [49]

    https://doi.org/10.1038/s41598-023-37761-2

    Integration of force and IMU sensors for developing low-cost portable gait measurement system in lower extremities.Scientific Reports13 (06 2023). https://doi.org/10.1038/s41598-023-37761-2

  50. [50]

    Alan Mazankiewicz, Klemens Böhm, and Mario Berges. 2020. Incremental Real- Time Personalization in Human Activity Recognition Using Domain Adaptive Batch Normalization.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.4, 4, Article 144 (dec 2020), 20 pages. https://doi.org/10.1145/3432230

  51. [51]

    Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam

    Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. 2023. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In International Conference on Learning Representations

  52. [52]

    Riccardo Presotto, Sannara Ek, Gabriele Civitarese, François Portet, Philippe Lalanda, and Claudio Bettini. 2023. Combining Public Human Activity Recogni- tion Datasets to Mitigate Labeled Data Scarcity.arXiv preprint arXiv:2306.13735 (2023)

  53. [53]

    Hangwei Qian, Sinno Jialin Pan, and Chunyan Miao. 2021. Latent independent excitation for generalizable sensor-based cross-person activity recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 11921–11929

  54. [54]

    Xin Qin, Yiqiang Chen, Jindong Wang, and Chaohui Yu. 2019. Cross-dataset activity recognition via adaptive spatial-temporal transfer learning.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies3, 4 (2019), 1–25

  55. [55]

    Xin Qin, Jindong Wang, Shuo Ma, Wang Lu, Yongchun Zhu, Xing Xie, and Yiqiang Chen. 2023. Generalizable Low-Resource Activity Recognition with Diverse and Discriminative Representation Learning. InProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining(Long Beach, CA, USA)(KDD ’23). Association for Computing Machinery, New York...

  56. [56]

    Xia Qingxin, Atsushi Wada, Joseph Korpela, Takuya Maekawa, and Yasuo Namioka. 2019. Unsupervised factory activity recognition with wearable sensors using process instruction information.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies3, 2 (2019), 1–23

  57. [57]

    Jorge-L Reyes-Ortiz, Luca Oneto, Albert Samà, Xavier Parra, and Davide An- guita. 2016. Transition-aware human activity recognition using smartphones. Neurocomputing171 (2016), 754–767

  58. [58]

    Aaqib Saeed, Tanir Ozcelebi, and Johan Lukkien. 2019. Multi-task self-supervised learning for human activity detection.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies3, 2 (2019), 1–30

  59. [59]

    Amray Schwabe, Joel Persson, and Stefan Feuerriegel. 2021. Predicting covid-19 spread from large-scale mobility data. InProceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 3531–3539

  60. [60]

    Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, and Sunita Sarawagi. 2018. Generalizing Across Domains via Cross- Gradient Training. InInternational Conference on Learning Representations

  61. [61]

    Shuai Shao, Yu Guan, Bing Zhai, Paolo Missier, and Thomas Plötz. 2023. Con- vBoost: Boosting ConvNets for Sensor-Based Activity Recognition.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.7, 2, Article 75 (jun 2023), 21 pages. https://doi.org/10.1145/3596234

  62. [62]

    Taoran Sheng and Manfred Huber. 2020. Weakly Supervised Multi-Task Repre- sentation Learning for Human Activity Analysis Using Wearables.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.4, 2, Article 57 (jun 2020), 18 pages. https://doi.org/10.1145/3397330

  63. [63]

    Muhammad Shoaib, Stephan Bosch, Ozlem Durmaz Incel, Hans Scholten, and Paul JM Havinga. 2014. Fusion of smartphone motion sensors for physical activity recognition.Sensors14, 6 (2014), 10146–10176

  64. [64]

    Sima Siami-Namini, Neda Tavakoli, and Akbar Siami Namin. 2019. The performance of LSTM and BiLSTM in forecasting time series. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 3285–3292

  65. [65]

    Bhavuk Singhal, Anshu Aditya, Lokesh Todwal, Shubham Jain, and Debashis Mukherjee. 2024. GeoIndia: A Seq2Seq Geocoding Approach for Indian Addresses. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track. 395–407

  66. [66]

    Allan Stisen, Henrik Blunck, Sourav Bhattacharya, Thor Siiger Prentow, Mikkel Baun Kjærgaard, Anind Dey, Tobias Sonne, and Mads Møller Jensen. 2015. Smart Devices Are Different: Assessing and Mitigating Mobile Sensing Heterogeneities for Activity Recognition. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (Seoul, South Korea) (SenSys ’15). Association for Computing Machinery, New York, NY, USA, 127–140. https://doi.org/10.1145/2809695.2809718

  68. [68]

    Jie Su, Zhenyu Wen, Tao Lin, and Yu Guan. 2022. Learning disentangled behaviour patterns for wearable-based human activity recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 1 (2022), 1–19

  69. [69]

    Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Soren Brage, Nick Wareham, and Cecilia Mascolo. 2021. SelfHAR: Improving human activity recognition through self-training with unlabeled data. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 5, 1 (2021), 1–30

  70. [70]

    Yu Tang, Leong Hou U, Yilun Cai, Nikos Mamoulis, and Reynold Cheng. 2013. Earth mover’s distance based similarity search at scale. Proceedings of the VLDB Endowment 7, 4 (2013), 313–324

  71. [71]

    Ilya O Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, et al. 2021. MLP-Mixer: An all-MLP architecture for vision. Advances in Neural Information Processing Systems 34 (2021), 24261–24272

  72. [72]

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)

  73. [73]

    Haohan Wang, Xindi Wu, Zeyi Huang, and Eric P Xing. 2020. High-frequency component helps explain the generalization of convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8684–8694

  74. [74]

    Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, and Philip Yu. 2022. Generalizing to unseen domains: A survey on domain generalization. IEEE Transactions on Knowledge and Data Engineering (2022)

  75. [75]

    Jindong Wang, Vincent W Zheng, Yiqiang Chen, and Meiyu Huang. 2018. Deep transfer learning for cross-domain activity recognition. In Proceedings of the 3rd International Conference on Crowd Science and Engineering. 1–8

  76. [76]

    Zhiguang Wang, Weizhong Yan, and Tim Oates. 2017. Time series classification from scratch with deep neural networks: A strong baseline. In 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 1578–1585

  77. [77]

    Haixu Wu, Jiehui Xu, Jianmin Wang, and Mingsheng Long. 2021. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in Neural Information Processing Systems 34 (2021), 22419–22430

  78. [78]

    Qingxin Xia, Joseph Korpela, Yasuo Namioka, and Takuya Maekawa. 2020. Robust Unsupervised Factory Activity Recognition with Body-Worn Accelerometer Using Temporal Structure of Multiple Sensor Data Motifs. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 4, 3, Article 97 (sep 2020), 30 pages. https://doi.org/10.1145/3411836

  79. [79]

    Huatao Xu, Pengfei Zhou, Rui Tan, and Mo Li. 2023. Practically Adopting Human Activity Recognition. In Proceedings of the 29th Annual International Conference on Mobile Computing and Networking. 1–15

  80. [80]

    Huatao Xu, Pengfei Zhou, Rui Tan, Mo Li, and Guobin Shen. 2021. LIMU-BERT: Unleashing the Potential of Unlabeled Data for IMU Sensing Applications. In Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems (Coimbra, Portugal) (SenSys ’21). Association for Computing Machinery, New York, NY, USA, 220–233. https://doi.org/10.1145/3485730.3485937
