OrganicHAR: Towards Activity Discovery in Organic Settings for Privacy Preserving Sensors Using Efficient Video Analysis

Adriano Soares; Ana Vasconcelos; Cristina Mendes Santos; Filippo Talami; Inês Silva; Joana Couto da Silva; Mayank Goel; Prasoon Patidar; Ricardo Graça; Riku Arakawa

arxiv: 2605.18455 · v1 · pith:Q3SU22PAnew · submitted 2026-05-18 · 💻 cs.HC

Prasoon Patidar, Riku Arakawa, Ricardo Graça, Rúben Moutinho, Adriano Soares, Ana Vasconcelos, Filippo Talami, Joana Couto da Silva, Inês Silva, Cristina Mendes Santos, Mayank Goel, Yuvraj Agarwal

Pith reviewed 2026-05-20 08:34 UTC · model grok-4.3

classification 💻 cs.HC

keywords human activity recognition · privacy preserving sensors · activity discovery · vision language models · ambient sensing · home monitoring · signal patterns · efficient video use

The pith

OrganicHAR discovers home activities by letting privacy-preserving sensors first find their own repeatable signal patterns and label them with video models only at those moments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that human activity recognition can work in real homes by reversing the usual order: privacy-preserving sensors like radar and thermal arrays first locate natural changes in their signals, then a vision language model is consulted only at those instants to supply labels that match what the sensors can actually tell apart. A sympathetic reader would care because existing methods either demand extensive per-home labeled data or rely on always-on cameras that raise privacy issues and fail when sensor views differ from camera views. By anchoring discovery in sensor-detectable patterns rather than camera-visible categories, the system produces user- and environment-specific activities that remain usable after the video step ends.

Core claim

OrganicHAR identifies naturally occurring signal patterns using privacy-preserving sensors, applies vision language models only during these key moments for scene understanding, and discovers discrete activity labels at granularities the sensors can reliably detect. With twelve participants it reaches 79 percent accuracy on four to five coarse activities using only ambient radar, lidar, and thermal arrays, and 73 percent on eight to nine fine-grained activities once a wearable IMU and depth and pose sensors are added. It averages 77 percent accuracy across setups and surfaces four to eight categories per user, fifteen distinct ones overall. Video queries fall by 90 percent because video processing is triggered only at the key moments the local sensors identify.

What carries the argument

Sensor-driven detection of signal pattern changes that selectively triggers brief video analysis for labeling, ensuring every discovered activity stays within the discrimination power of the local sensors.
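The gating idea can be illustrated with a small sketch. This is not the paper's detector: it assumes a single-channel signal and uses a rolling z-score as a stand-in for pattern-change detection, and the function name `select_key_moments` and the parameters `window`, `z_threshold`, and `cooldown` are hypothetical.

```python
from statistics import mean, pstdev


def select_key_moments(signal, window=5, z_threshold=3.0, cooldown=5):
    """Return indices where the signal departs sharply from its recent baseline.

    Hypothetical stand-in for a sensor-side pattern detector: a rolling
    z-score with a cooldown so one event does not trigger repeated queries.
    """
    triggers = []
    last = -cooldown
    for i in range(window, len(signal)):
        baseline = signal[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline) or 1e-9  # guard against a perfectly flat baseline
        z = abs(signal[i] - mu) / sigma
        if z >= z_threshold and i - last >= cooldown:
            triggers.append(i)
            last = i
    return triggers


# Synthetic trace (assumption): quiet, then a step up at index 20, then a step down at 40.
trace = [0.0, 0.1] * 10 + [5.0, 5.1] * 10 + [0.0, 0.1] * 10
key_moments = select_key_moments(trace)
queried_fraction = len(key_moments) / len(trace)
print(key_moments, f"{queried_fraction:.1%} of samples would be sent to the video model")
```

On this synthetic trace only the two step changes (indices 20 and 40) would trigger a video query; everything else stays on the sensor side.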

Load-bearing premise

Naturally occurring signal patterns from the sensors map to discrete, repeatable human activities whose labels a vision language model can supply from short triggered clips at a granularity the sensors can later distinguish without video.

What would settle it

Measure whether recognition accuracy stays near 77 percent when the system runs on new participants in completely unseen homes with no further video labeling or model adjustment.

Figures

Figures reproduced from arXiv: 2605.18455 by Adriano Soares, Ana Vasconcelos, Cristina Mendes Santos, Filippo Talami, Inês Silva, Joana Couto da Silva, Mayank Goel, Prasoon Patidar, Ricardo Graça, Riku Arakawa, Rúben Moutinho, Yuvraj Agarwal.

**Figure 2.** Visualizing information across various activities from various privacy-preserving sensors inspired by our prior work …

**Figure 3.** Overall architecture of OrganicHAR. Raw sensor signals from different hardware configurations (§ …

**Figure 4.** Kitchen environments used in our study: (left) Kitchen 1 with compact galley layout, (middle) Kitchen 2 with island …

**Figure 5.** Percentage of video data requiring VLM analysis across configurations. Our approach processes only 9-11% of total video data, demonstrating efficiency compared to continuous monitoring.

Accompanying table, Conservative granularity, by sensor configuration (Ambient (Basic) Only / Ambient (Basic) + Wearable (IMU) / Ambient (Advanced) + Wearable (IMU)): Accuracy 90.4%±8.9% / 91.1%±6.7% / 91.7%±6.9%; F1 Score 89.4%±9.8% / 90.6%±6.4% / 90.3%±7.4%; B…

**Figure 6.** Discovered activity labels across three semantic granularity settings: Conservative ( …

**Figure 7.** Per-participant accuracy across three sensing configurations. Kitchen 1 participants (P1-P4) show consistently high …

**Figure 8.** Per-participant F1 scores across sensing configurations, revealing sharper performance drops than accuracy metrics …

**Figure 9.** Confusion matrices showing the HAR performance using basic ambient sensors across three granularity settings. As …

**Figure 10.** Incremental training analysis of OrganicHAR: (a) Count of VLM queries after incorporating new training session. (b) …

**Figure 11.** Kitchen environments used in real-world home deployments: (Home-1) compact galley layout captured from overhead …

**Figure 12.** Overall recognition accuracy in real-world …

**Figure 14.** Interface for customizing the sensors to be used and activity labels. The percentage (%) values show how well a …

**Figure 16.** Impact of frame rate on label discovery perfor…

**Figure 17.** Confusion matrices showing activity recognition performance for …

**Figure 18.** Confusion matrices showing activity recognition performance for …

Original abstract

Deploying human activity recognition (HAR) at home is still rare because sensor signals vary wildly across houses, people, and time, essentially requiring in-situ data collection and training. Prior approaches use cameras to generate training labels for privacy-preserving sensors (LiDAR, RADAR, Thermal), but this forces sensors to detect predefined activities that cameras can see yet the sensors themselves cannot reliably distinguish. In this work, we introduce OrganicHAR, an activity discovery framework that inverts this relationship by placing sensor capabilities at the center of activity discovery. Our approach identifies naturally occurring signal patterns using privacy-preserving sensors, leverages Vision Language Models (VLMs) only during these key moments for scene understanding, and discovers discrete activity labels at granularities that these sensors can reliably detect. Our evaluation with 12 participants demonstrates OrganicHAR's effectiveness: it achieves 79% accuracy for coarse (4-5) activities using only basic ambient sensors (radar, lidar, thermal arrays), and 73% accuracy for fine-grained (8-9) activities when a wearable IMU, depth, and pose sensor are added. OrganicHAR maintains 77% accuracy on average across configurations while discovering 4-8 categories per user (15 across all users) tailored to each environment and sensor capabilities. By triggering video processing only at key moments identified by local sensors, we reduce queries to VLM by 90%, enabling practical and privacy-preserving activity recognition in natural settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OrganicHAR, an activity discovery framework for privacy-preserving human activity recognition in organic home settings. It uses ambient sensors (radar, lidar, thermal arrays) and optionally wearables (IMU, depth, pose) to detect natural signal patterns, triggering VLMs only at those moments to generate discrete activity labels tailored to sensor capabilities and environments. Evaluation with 12 participants reports 79% accuracy on 4-5 coarse activities with basic sensors, 73% on 8-9 fine-grained activities with added sensors, 77% average accuracy, discovery of 4-8 categories per user (15 total), and 90% reduction in VLM queries.

Significance. If the central claims hold after addressing validation gaps, the work offers a practical path to in-situ HAR that avoids predefined activity taxonomies and constant video use, potentially improving privacy and adaptability across households. The query reduction and per-user category discovery are concrete strengths that could influence sensor-driven discovery methods in HCI and ubiquitous computing.

major comments (2)

[Evaluation] Evaluation with 12 participants (abstract and results section): the reported accuracies (79% coarse, 73% fine-grained, 77% average) are computed against VLM-assigned labels with no description of independent human ground-truth collection, activity boundary definitions, or inter-rater agreement metrics. This is load-bearing for the effectiveness claim because the numbers measure reproduction of VLM outputs rather than independently verifiable sensor-distinguishable activities.
[Abstract] Abstract and method overview: the assumption that naturally occurring sensor signal patterns correspond to repeatable, VLM-labelable activities at a granularity the sensors can distinguish is stated but not tested against a hold-out human-annotated set or consistency checks on VLM outputs. Without this, the 90% query reduction and category counts risk being self-referential.

minor comments (2)

[Method] Clarify in the method section how signal pattern detection thresholds are set and whether they are user- or environment-specific.
[Abstract] The abstract lists sensor configurations but does not explicitly state the exact participant demographics or house types; adding a short table or sentence would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our evaluation methodology and the underlying assumptions of OrganicHAR. We address each major comment below and commit to revisions that strengthen the validation of our claims without altering the core contributions.

Referee: [Evaluation] Evaluation with 12 participants (abstract and results section): the reported accuracies (79% coarse, 73% fine-grained, 77% average) are computed against VLM-assigned labels with no description of independent human ground-truth collection, activity boundary definitions, or inter-rater agreement metrics. This is load-bearing for the effectiveness claim because the numbers measure reproduction of VLM outputs rather than independently verifiable sensor-distinguishable activities.

Authors: We agree that the primary reported accuracies are measured against VLM-assigned labels, as the framework uses VLMs to generate discrete activity categories from sensor-triggered moments. This choice enables discovery of user- and environment-specific activities without relying on predefined taxonomies. To address the concern directly, we will revise the manuscript to include a new subsection on independent validation: we collected human annotations for a random 20% subset of the detected events from three annotators, defined activity boundaries based on sensor signal changes, and will report inter-rater agreement (Fleiss' kappa) along with agreement rates between human labels and VLM outputs. This addition will demonstrate that the sensor-based classifiers capture activities distinguishable beyond VLM reproduction alone. revision: yes
Referee: [Abstract] Abstract and method overview: the assumption that naturally occurring sensor signal patterns correspond to repeatable, VLM-labelable activities at a granularity the sensors can distinguish is stated but not tested against a hold-out human-annotated set or consistency checks on VLM outputs. Without this, the 90% query reduction and category counts risk being self-referential.

Authors: The 90% VLM query reduction stems from the sensor-driven pattern detection step, which operates independently of any labels and triggers video analysis only at candidate moments; this metric is therefore not self-referential. For the discovered categories and their alignment with sensor capabilities, we acknowledge the value of additional checks. In the revised manuscript we will add: (i) consistency analysis of VLM outputs by re-querying a subset of moments with varied prompts and reporting label stability, and (ii) a hold-out human-annotated evaluation where annotators assess whether the discovered categories correspond to repeatable, sensor-distinguishable behaviors in the raw signals. These steps will provide external evidence that the per-user categories (4-8) reflect genuine activity structure rather than VLM artifacts. revision: yes
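The first response names Fleiss' kappa as the inter-rater statistic. As a reference for readers, here is a minimal sketch of the standard formula; the rating table is made up for illustration and is not data from the paper.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of shape (items x categories).

    ratings[i][j] is the number of annotators who assigned item i to
    category j; every item must be rated by the same number of annotators.
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("each item needs the same number of ratings")
    n_cats = len(ratings[0])

    # Per-item agreement: share of annotator pairs that agree on the item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall category proportions.
    p_cats = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_e = sum(p * p for p in p_cats)

    return (p_bar - p_e) / (1 - p_e)


# Illustrative table (assumption): 4 detected events, 3 annotators, 3 candidate labels.
example = [
    [3, 0, 0],
    [0, 3, 0],
    [2, 1, 0],
    [0, 0, 3],
]
kappa = fleiss_kappa(example)
print(round(kappa, 3))
```

A value near 1 indicates agreement well above chance; values near 0 indicate agreement no better than chance.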

Circularity Check

0 steps flagged

No circularity in empirical evaluation

The paper presents a system and empirical evaluation with 12 participants that uses local sensors to trigger VLM labeling at detected signal patterns, then reports classification accuracies on the resulting (sensor, VLM-label) pairs. No mathematical derivation, equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. The reported 79%/73%/77% accuracies and discovered category counts are direct outcomes of the data collection and training process rather than any reduction to the inputs by construction. Label quality concerns belong to assumption validity, not circularity.
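Since the reported numbers are classification scores on (sensor, VLM-label) pairs, and the Figure 8 caption notes that F1 drops more sharply than accuracy, a short sketch of how the two metrics can diverge may help. It assumes macro-averaged F1 (this page does not say which averaging the paper uses), and the labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced session: the classifier never predicts the rare class.
vlm_labels = ["cooking"] * 8 + ["washing"] * 2
sensor_pred = ["cooking"] * 10

acc = accuracy(vlm_labels, sensor_pred)
f1 = macro_f1(vlm_labels, sensor_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

Here accuracy stays at 0.80 while macro F1 falls to about 0.44, because the missed minority class contributes an F1 of zero.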

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the empirical observation that sensor signals contain repeatable patterns that align with human activities and that a VLM can supply accurate scene descriptions at the moments those patterns occur. No free parameters, axioms, or invented entities are explicitly introduced in the abstract.

pith-pipeline@v0.9.0 · 5843 in / 1361 out tokens · 44222 ms · 2026-05-20T08:34:00.908236+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem.

We employ clustering in the feature space of each sensor modality … HDBSCAN … scoring function … min_cluster_size …

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat · unclear

We also leverage the temporal changes … Gaussian Mixture Model (GMM) … anomaly score

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

relaxation parameter λ … hierarchical clustering … S_ij ≥ 1 − λ
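The last quoted passage pairs a relaxation parameter λ with hierarchical clustering and the condition S_ij ≥ 1 − λ. One way to read that condition is sketched below, assuming S_ij is a pairwise similarity in [0, 1] between discovered labels and that merging uses complete linkage. Both are assumptions, since the excerpt is elided, and the labels and similarity values are invented for illustration.

```python
def merge_labels(labels, similarity, lam):
    """Greedy complete-linkage merging under a relaxation parameter lam.

    Two groups merge only if every cross pair (i, j) satisfies
    similarity[i][j] >= 1 - lam, so a larger lam yields coarser groups.
    """
    clusters = [[i] for i in range(len(labels))]

    def linkage(a, b):
        # Complete linkage: the weakest similarity between the two groups.
        return min(similarity[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < 1 - lam:
            break  # no remaining pair of groups meets the threshold
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [[labels[i] for i in group] for group in clusters]


# Hypothetical labels and similarities, for illustration only.
names = ["chopping", "slicing", "washing dishes", "rinsing"]
S = [
    [1.0, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.7],
    [0.3, 0.2, 0.7, 1.0],
]
fine = merge_labels(names, S, lam=0.2)
coarse = merge_labels(names, S, lam=0.4)
print(fine)
print(coarse)
```

With these values, λ = 0.2 merges only "chopping" and "slicing", while λ = 0.4 also merges "washing dishes" with "rinsing", which mirrors how a looser threshold would produce coarser activity labels.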

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[55]

Preksha Pareek and Ankit Thakkar. 2021. A survey on video-based human action recognition: recent updates, datasets, challenges, and applications.Artificial Intelligence Review54, 3 (2021), 2259–2322

work page 2021
[56]

Prasoon Patidar, Mayank Goel, and Yuvraj Agarwal. 2023. VAX: Using Existing Video and Audio-based Activity Recognition Models to Bootstrap Privacy-Sensitive Sensors.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies7, 3 (Sept. 2023), 1–24. https://doi.org/10.1145/3610907

work page doi:10.1145/3610907 2023
[57]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python.Journal of Machine Learning Research12 (2011), 2825–2830

work page 2011
[58]

Daniel Perazzo, Natalia Souza Soares, Victor Gouveia de Menezes Lyra, Gustavo Camargo Rocha Lima, Alana Elza Fontes da Gama, Joao Marcelo Xavier Natario Teixeira, and Veronica Teichrieb. 2022. OAK-D as a Platform for Human Movement Analysis: A Case Study. InProceedings of the 23rd Symposium on Virtual and Augmented Reality(Virtual Event, Brazil)(SVR ’21)....

work page doi:10.1145/3488162.3488222 2022
[59]

Prasoon Patidar, Riku Arakawa, Mayank Goel, Yuvraj Agarwal. 2025. OrganicHAR: Open-source repository for the OrganicHAR. https://github.com/synergylabs/OrganicHAR

work page 2025
[60]

Riccardo Presotto, Gabriele Civitarese, and Claudio Bettini. 2022. Federated Clustering and Semi-Supervised learning: A new partnership for personalized Human Activity Recognition.Pervasive and Mobile Computing88 (2022), 101726

work page 2022
[61]

Suneth Ranasinghe, Fadi Al Machot, and Heinrich C Mayr. 2016. A review on applications of activity recognition systems with regard to performance and evaluation.International Journal of Distributed Sensor Networks12, 8 (2016), 1550147716665520. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 9, No. 4, Article 203. Publication date: December 20...

work page 2016
[62]

Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, and David Lindner. 2024. Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning. https://doi.org/10.48550/arXiv.2310.12921 arXiv:2310.12921 [cs] version: 2

work page doi:10.48550/arxiv.2310.12921 2024
[63]

Laurens Samson, Nimrod Barazani, Sennay Ghebreab, and Yuki M. Asano. 2025. Little Data, Big Impact: Privacy-Aware Visual Language Models via Minimal Tuning. https://doi.org/10.48550/arXiv.2405.17423 arXiv:2405.17423 [cs]

work page doi:10.48550/arxiv.2405.17423 2025
[64]

Khoshgoftaar, Jason Van Hulse, and Amri Napolitano

Chris Seiffert, Taghi M. Khoshgoftaar, Jason Van Hulse, and Amri Napolitano. 2010. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance.IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans40, 1 (Jan. 2010), 185–197. https://doi.org/10.1109/TSMCA.2009.2029559

work page doi:10.1109/tsmca.2009.2029559 2010
[65]

Pekka Siirtola and Juha Röning. 2019. Incremental Learning to Personalize Human Activity Recognition Models: The Importance of Human AI Collaboration.Sensors (Basel, Switzerland)19, 23 (Nov. 2019), 5151. https://doi.org/10.3390/s19235151

work page doi:10.3390/s19235151 2019
[66]

Adane Nega Tarekegn, Mohib Ullah, Faouzi Alaya Cheikh, and Muhammad Sajjad. 2023. Enhancing Human Activity Recognition Through Sensor Fusion And Hybrid Deep Learning Model. In2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, Rhodes Island, Greece, 1–5. https://doi.org/10.1109/ICASSPW59220.2023.10193698

work page doi:10.1109/icasspw59220.2023.10193698 2023
[67]

Maytin, Yash Kumar, Toluwalashe Onamusi, Haarika A

Annalise Vaccarello, Alexander K. Maytin, Yash Kumar, Toluwalashe Onamusi, Haarika A. Reddy, Mayank Goel, Riku Arakawa, Jill Fain Lehman, and Bryan T. Carroll. 2024. Barriers to use of digital assistance for postoperative wound care: a single-center survey of dermatologic surgery patients.Archives of Dermatological Research316, 7 (June 2024), 376. https:/...

work page doi:10.1007/s00403-024-03025-w 2024
[68]

Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, St´ efan J

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W. Mo...

work page doi:10.1038/s41592-019-0686-2 2020
[69]

Michalis Vrigkas, Christophoros Nikou, and Ioannis A Kakadiaris. 2015. A review of human activity recognition methods.Frontiers in Robotics and AI2 (2015), 28

work page 2015
[70]

Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, Wanjing Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi He, Yao Ma, Ming Huang, and Suhang Wang. 2024. A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness. arXiv:2411.0335...

work page arXiv 2024
[71]

Shuai Wang, Luoyu Mei, Ruofeng Liu, Wenchao Jiang, Zhimeng Yin, Xianjun Deng, and Tian He. 2025. Multi-Modal Fusion Sensing: A Comprehensive Review of Millimeter-Wave Radar and Its Integration With Other Modalities.IEEE Commun. Surv. Tutorials27, 1 (2025), 322–352. https://doi.org/10.1109/COMST.2024.3398004

work page doi:10.1109/comst.2024.3398004 2025
[72]

Pete Warden, Matthew Stewart, Brian Plancher, Colby Banbury, Shvetank Prakash, Emma Chen, Zain Asgar, Sachin Katti, and Vijay Janapa Reddi. 2022. Machine Learning Sensors. https://doi.org/10.48550/ARXIV.2206.03266

work page doi:10.48550/arxiv.2206.03266 2022
[73]

Why is ’Chicago’ deceptive?

Jason Wu, Chris Harrison, Jeffrey P. Bigham, and Gierad Laput. 2020. Automated Class Discovery and One-Shot Interactions for Acoustic Activity Recognition. InProceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, Honolulu HI USA, 1–14. https://doi.org/10.1145/3313831.3376875

work page doi:10.1145/3313831.3376875 2020
[74]

Tong Wu, Murtadha Aldeer, Tahiya Chowdhury, Amber Haynes, Fateme Nikseresht, Mahsa Pahlavikhah Varnosfaderani, Jiechao Gao, Arsalan Heydarian, Brad Campbell, and Jorge Ortiz. 2021. The Smart Building Privacy Challenge. InProceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation(Coimbra, Portu...

work page doi:10.1145/3486611.3492234 2021
[75]

Chengshuo Xia, Xinrui Fang, Riku Arakawa, and Yuta Sugiura. 2022. VoLearn: A Cross-Modal Operable Motion-Learning System Combined with Virtual Avatar and Auditory Feedback.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.6, 2 (2022), 81:1–81:26. https://doi.org/10.1145/3534576

work page doi:10.1145/3534576 2022
[76]

In: Annals of Operations Research

Kenji Yamanishi, Jun’ichi Takeuchi, Graham J. Williams, and Peter Milne. 2004. On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms.Data Mining and Knowledge Discovery8, 3 (2004), 275–300. https://doi.org/10.1023/B: DAMI.0000023676.72185.7c

work page doi:10.1023/b: 2004
[77]

Murat Yağcı, Tevfik Aytekin, and Fikret S

A. Murat Yağcı, Tevfik Aytekin, and Fikret S. Gürgen. 2016. Balanced random forest for imbalanced data streams. In2016 24th Signal Processing and Communication Application Conference (SIU). IEEE, Zonguldak, Turkey, 1065–1068. https://doi.org/10.1109/SIU.2016. 7495927

work page doi:10.1109/siu.2016 2016
[78]

Nguyen, Taesik Gong, and Sung-Ju Lee

Hyungjun Yoon, Hyeongheon Cha, Hoang C. Nguyen, Taesik Gong, and Sung-Ju Lee. 2024. IMG2IMU: Translating Knowledge from Large-Scale Images to IMU Sensing Applications. https://doi.org/10.48550/arXiv.2209.00945 arXiv:2209.00945 [cs]

work page doi:10.48550/arxiv.2209.00945 2024
[79]

Sojeong Yun and Youn-kyung Lim. 2025. What If Smart Homes Could See Our Homes?: Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, USA, 1–22. https://doi.org/10.1145/3706598.3713265

work page doi:10.1145/3706598.3713265 2025
[80]

Shugang Zhang, Zhiqiang Wei, Jie Nie, Lei Huang, Shuang Wang, and Zhen Li. 2017. A Review on Human Activity Recognition Using Vision-Based Method.Journal of Healthcare Engineering2017, 1 (2017), 3090343. https://doi.org/10.1155/2017/3090343 Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 9, No. 4, Article 203. Publication date: December 2025. ...

work page doi:10.1155/2017/3090343 2017


[1] [1]

Ramokapane, and Jose M

Noura Abdi, Kopo M. Ramokapane, and Jose M. Such. 2019. More than Smart Speakers: Security and Privacy Perceptions of Smart Home Personal Assistants. InFifteenth Symposium on Usable Privacy and Security (SOUPS 2019). USENIX Association, Santa Clara, CA, 451–466. https://www.usenix.org/conference/soups2019/presentation/abdi

work page 2019

[2] [2]

Antonio A Aguileta, Ramon F Brena, Oscar Mayora, Erik Molino-Minero-Re, and Luis A Trejo. 2019. Multi-sensor fusion for activity recognition—A survey.Sensors19, 17 (2019), 3808

work page 2019

[3] [3]

Karan Ahuja, Yue Jiang, Mayank Goel, and Chris Harrison. 2021. Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition. InProceedings of the 2021 CHI Conference on Human Factors in Computing Systems(Yokohama, Japan)(CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 292, 10 pages...

work page doi:10.1145/3411764.3445138 2021

[4] [4]

Riku Arakawa, Jill Fain Lehman, and Mayank Goel. 2024. PrISM-Q&A: Step-Aware Voice Assistant on a Smartwatch Enabled by Multimodal Procedure Tracking and Large Language Models.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.8, 4 (Nov. 2024), 180:1–180:26. https://doi.org/10.1145/3699759

work page doi:10.1145/3699759 2024

[5] [5]

Riku Arakawa, Prasoon Patidar, Will Page, Jill Lehman, and Mayank Goel. 2025. Scaling Context-Aware Task Assistants that Learn from Demonstration and Adapt through Mixed-Initiative Dialogue. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (UIST ’25). Association for Computing Machinery, New York, NY, USA, Article 1...

work page doi:10.1145/3746059.3747700 2025

[6] [6]

Riku Arakawa, Hiromu Yakura, and Mayank Goel. 2024. PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch. InProceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST ’24). Association for Computing Machinery, New York, NY, USA, 1–16. https://doi.org/10.1145/3654777.3676350

work page doi:10.1145/3654777.3676350 2024

[7] [7]

DeMeo, Haarika A

Riku Arakawa, Hiromu Yakura, Vimal Mollyn, Suzanne Nie, Emma Russell, Dustin P. DeMeo, Haarika A. Reddy, Alexander K. Maytin, Bryan T. Carroll, Jill Fain Lehman, and Mayank Goel. 2023. PrISM-Tracker: A Framework for Multimodal Procedure Tracking Using Wearable Sensors and State Transition Information with User-Driven Handling of Errors and Uncertainty.Pro...

work page doi:10.1145/3569504 2023

[8] [8]

Paola Ariza Colpas, Enrico Vicario, Emiro De-La-Hoz-Franco, Marlon Pineres-Melo, Ana Oviedo-Carrascal, and Fulvio Patara. 2020. Unsupervised Human Activity Recognition Using the Clustering Approach: A Review.Sensors20, 9 (Jan. 2020), 2702. https://doi.org/10. 3390/s20092702 Number: 9 Publisher: Multidisciplinary Digital Publishing Institute

work page 2020

[9] [9]

Luca Arrotta, Claudio Bettini, Gabriele Civitarese, and Michele Fiori. 2024. ContextGPT: Infusing LLMs Knowledge into Neuro-Symbolic Activity Recognition Models. In2024 IEEE International Conference on Smart Computing (SMARTCOMP). IEEE, Osaka, Japan, 55–62

work page 2024

[10] [10]

Autonomous. 2024. AUTONOMOUS; Co-Designing Independence — autonomous-project.com. https://www.autonomous-project.com/. [Accessed 10-10-2025]

work page 2024

[11] [11]

Awan-Ur-Rahman. 2023. Understanding Soft Voting and Hard Voting: A Comparative Analysis of Ensemble Learning Meth- ods. https://medium.com/@awanurrahman.cse/understanding-soft-voting-and-hard-voting-a-comparative-analysis-of-ensemble- learning-methods-db0663d2c008

work page 2023

[12] [12]

Oresti Banos, Juan-Manuel Galvez, Miguel Damas, Hector Pomares, and Ignacio Rojas. 2014. Window Size Impact in Human Activity Recognition.Sensors14, 4 (April 2014), 6474–6499. https://doi.org/10.3390/s140406474 Number: 4 Publisher: Multidisciplinary Digital Publishing Institute

work page doi:10.3390/s140406474 2014

[13] [13]

Sejal Bhalla, Mayank Goel, and Rushil Khurana. 2021. IMU2Doppler: Cross-Modal Domain Adaptation for Doppler-based Activity Recognition Using IMU Data.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies5, 4 (2021), 1–20

work page 2021

[14] [14]

Sarnab Bhattacharya, Rebecca Adaimi, and Edison Thomaz. 2022. Leveraging sound and wrist motion to detect activities of daily living with commodity smartwatches.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies6, 2 (2022), 42:1–42:28. https://doi.org/10.1145/3534582

work page doi:10.1145/3534582 2022

[15] [15]

Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Me...

work page doi:10.48550/arxiv.2405.17247 2024

[16] [16]

Damien Bouchabou, Sao Mai Nguyen, Christophe Lohr, Benoit LeDuc, and Ioannis Kanellos. 2021. A Survey of Human Activity Recognition in Smart Homes Based on IoT Sensors Algorithms: Taxonomies, Challenges, and Opportunities with Deep Learning.Sensors (Basel, Switzerland)21, 18 (Sept. 2021), 6037. https://doi.org/10.3390/s21186037

work page doi:10.3390/s21186037 2021

[17] [17]

Bernheim Brush, Bongshin Lee, Ratul Mahajan, Sharad Agarwal, Stefan Saroiu, and Colin Dixon

A.J. Bernheim Brush, Bongshin Lee, Ratul Mahajan, Sharad Agarwal, Stefan Saroiu, and Colin Dixon. 2011. Home automation in the wild: challenges and opportunities. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’11). Association for Computing Machinery, New York, NY, USA, 2115–2124. https://doi.org/10.1145/1978942.1979249...

work page doi:10.1145/1978942.1979249 2011

[18] [18]

Timothy I Cannings, Yingying Fan, and Richard J Samworth. 2020. Classification with imperfect training labels.Biometrika107, 2 (2020), 311–330

work page 2020

[19] [19]

João Carreira and Andrew Zisserman. 2017. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017. IEEE Computer Society, Honolulu, HI, USA, 4724–4733. https://doi.org/10.1109/CVPR.2017.502

work page doi:10.1109/cvpr.2017.502 2017

[20] [20]

Gabriele Cipriani, Sabrina Danti, Lucia Picchi, Angelo Nuti, and Mario Di Fiorino. 2020. Daily functioning and dementia.Dementia & Neuropsychologia14, 2 (2020), 93–102. https://doi.org/10.1590/1980-57642020dn14-020001

work page doi:10.1590/1980-57642020dn14-020001 2020

[21] [21]

Diane Cook, Narayanan Krishnan, and Parisa Rashidi. 2013. Activity Discovery and Activity Recognition: A New Partnership.IEEE transactions on cybernetics43, 3 (June 2013), 820–828. https://doi.org/10.1109/TSMCB.2012.2216873

work page doi:10.1109/tsmcb.2012.2216873 2013

[22] [22]

Ivan Culjak, David Abram, Tomislav Pribanic, Hrvoje Dzapo, and Mario Cifrek. 2012. A brief introduction to OpenCV. In2012 Proceedings of the 35th International Convention MIPRO. IEEE, Opatija, Croatia, 1725–1730

work page 2012

[23] [23]

Smith, and Flora D

Shohreh Deldari, Hao Xue, Aaqib Saeed, Jiayuan He, Daniel V. Smith, and Flora D. Salim. 2022. Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data. arXiv:2206.02353 [cs.LG]

work page arXiv 2022

[24] [24]

Kaikai Deng, Dong Zhao, Zihan Zhang, Shuyue Wang, Wenxin Zheng, and Huadong Ma. 2024. Midas++: Generating Training Data of mmWave Radars From Videos for Privacy-Preserving Human Sensing With Mobility.IEEE Transactions on Mobile Computing23, 6 (June 2024), 6650–6666. https://doi.org/10.1109/TMC.2023.3325399

work page doi:10.1109/tmc.2023.3325399 2024

[25] [25]

Nathan DeVrio, Vimal Mollyn, and Chris Harrison. 2023. SmartPoser: Arm Pose Estimation with a Smartphone and Smartwatch Using UWB and IMU Data. InProceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST ’23). Association for Computing Machinery, New York, NY, USA, 1–11. https://doi.org/10.1145/3586183.3606821

work page doi:10.1145/3586183.3606821 2023

[26] [26]

Ha, Emma Russell, Haarika A

Megan V. Ha, Emma Russell, Haarika A. Reddy, Alexander K. Maytin, Dustin P. DeMeo, Riku Arakawa, Mayank Goel, Jill F. Lehman, and Bryan T. Carroll. 2024. Self-narration for patient monitoring with smartwatch technology in post-operative wound care after dermatologic surgery.Archives of Dermatological Research316, 7 (June 2024), 389. https://doi.org/10.100...

work page doi:10.1007/s00403-024-03149-z 2024

[27] [27]

Harris, K

Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Shepp...

work page doi:10.1038/s41586-020-2649-2 2020

[28] [28]

Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, and Juan Carlos Niebles. 2015. ActivityNet: A large-scale video benchmark for human activity understanding. In2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, Boston, MA, USA, 961–970. https://doi.org/10.1109/CVPR.2015.7298698

work page doi:10.1109/cvpr.2015.7298698 2015

[29] [29]

Hiremath, Yasutaka Nishimura, Sonia Chernova, and Thomas Plötz

Shruthi K. Hiremath, Yasutaka Nishimura, Sonia Chernova, and Thomas Plötz. 2022. Bootstrapping Human Activity Recognition Systems for Smart Homes from Scratch.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies6, 3 (Sept. 2022), 1–27. https://doi.org/10.1145/3550294

work page doi:10.1145/3550294 2022

[30] [30]

Hiremath and Thomas Plötz

Shruthi K. Hiremath and Thomas Plötz. 2023. The Lifespan of Human Activity Recognition Systems for Smart Homes.Sensors23, 18 (Jan. 2023), 7729. https://doi.org/10.3390/s23187729 Number: 18 Publisher: Multidisciplinary Digital Publishing Institute

work page doi:10.3390/s23187729 2023

[31] [31]

Yash Jain, Chi Ian Tang, Chulhong Min, Fahim Kawsar, and Akhil Mathur. 2022. ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition.Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.6, 1, Article 17 (mar 2022), 28 pages. https: //doi.org/10.1145/3517246

work page doi:10.1145/3517246 2022

[32] [32]

Ahmad Jalal, Shaharyar Kamal, and Daijin Kim. 2017. A Depth Video-based Human Detection and Activity Recognition using Multi- features and Embedded Hidden Markov Models for Health Care Monitoring Systems.International Journal of Interactive Multimedia and Artificial Intelligence4, Regular Issue (2017), 54–62. https://www.ijimai.org/journal/bibcite/reference/2606

work page 2017

[33] [33]

Tianjie Ju, Yi Hua, Hao Fei, Zhenyu Shao, Yubin Zheng, Haodong Zhao, Mong-Li Lee, Wynne Hsu, Zhuosheng Zhang, and Gongshen Liu. 2025. Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models. https: //doi.org/10.48550/arXiv.2503.01208 arXiv:2503.01208 [cs]

work page doi:10.48550/arxiv.2503.01208 2025

[34] [34]

Alexander Karpekov, Sonia Chernova, and Thomas Plötz. 2025. DISCOVER: Data-driven Identification of Sub-activities via Clustering and Visualization for Enhanced Activity Recognition in Smart Homes. https://doi.org/10.48550/arXiv.2503.01733 arXiv:2503.01733 [cs]

work page doi:10.48550/arxiv.2503.01733 2025

[35] [35]

Hyeokhyen Kwon, Catherine Tong, Harish Haresamudram, Yan Gao, Gregory D Abowd, Nicholas D Lane, and Thomas Ploetz. 2020. IMUTube: Automatic extraction of virtual on-body accelerometry from video for human activity recognition.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies4, 3 (2020), 1–29

work page 2020

[36] [36]

Gierad Laput and Chris Harrison. 2019. SurfaceSight: A New Spin on Touch, User, and Object Sensing for IoT Experiences. InProceedings of the 2019 CHI Conference on Human Factors in Computing Systems(Glasgow, Scotland Uk)(CHI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3290605.3300559

work page doi:10.1145/3290605.3300559 2019

[37] [37]

Gierad Laput, Yang Zhang, and Chris Harrison. 2017. Synthetic Sensors: Towards General-Purpose Sensing. InProc. of the 2017 CHI Conference on Human Factors in Computing Systems(Denver, Colorado, USA)(CHI ’17). ACM, New York, NY, USA, 3986–3999. https://doi.org/10.1145/3025453.3025773 Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 9, No. 4, Ar...

work page doi:10.1145/3025453.3025773 2017

[38] [38]

Guillaume Lemaître, Fernando Nogueira, and Christos K. Aridas. 2017. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning.Journal of Machine Learning Research18, 17 (2017), 1–5. http://jmlr.org/papers/v18/16-365.html

work page 2017

[39] [39]

Zikang Leng, Amitrajit Bhattacharjee, Hrudhai Rajasekhar, Lizhe Zhang, Elizabeth Bruda, Hyeokhyen Kwon, and Thomas Plötz. 2024. IMUGPT 2.0: Language-Based Cross Modality Transfer for Sensor-Based Human Activity Recognition.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies8, 3 (Aug. 2024), 1–32. https://doi.org/10.1145/3678545

work page doi:10.1145/3678545 2024

[40] [40]

Zikang Leng, Hyeokhyen Kwon, and Thomas Ploetz. 2023. Generating Virtual On-body Accelerometer Data from Virtual Textual Descriptions for Human Activity Recognition. InProceedings of the 2023 ACM International Symposium on Wearable Computers (ISWC ’23). Association for Computing Machinery, New York, NY, USA, 39–43. https://doi.org/10.1145/3594738.3611361

work page doi:10.1145/3594738.3611361 2023

[41] [41]

Zikang Leng, Hyeokhyen Kwon, and Thomas Plötz. 2023. On the Benefit of Generative Foundation Models for Human Activity Recognition. https://doi.org/10.48550/arXiv.2310.12085 arXiv:2310.12085 [cs]

work page doi:10.48550/arxiv.2310.12085 2023

[42] [42]

Dawei Liang, Guihong Li, Rebecca Adaimi, Radu Marculescu, and Edison Thomaz. 2022. AudioIMU: Enhancing Inertial Sensing-Based Activity Recognition with Acoustic Models. InProceedings of the 2022 ACM International Symposium on Wearable Computers(Cambridge, United Kingdom)(ISWC ’22). Association for Computing Machinery, New York, NY, USA, 44–48. https://doi...

work page doi:10.1145/3544794.3558471 2022

[43] [43]

Sicong Liu, Junzhao Du, Anshumali Shrivastava, and Lin Zhong. 2019. Privacy Adversarial Network.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies3, 4 (dec 2019), 1–18. https://doi.org/10.1145/3369816

work page doi:10.1145/3369816 2019

[44] [44]

Tian-Yu Liu. 2009. EasyEnsemble and Feature Selection for Imbalance Data Sets. In2009 International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing. IEEE, Shanghai, China, 517–520. https://doi.org/10.1109/IJCBS.2009.22

work page doi:10.1109/ijcbs.2009.22 2009

[45] [45]

Harsh Lunia. 2024. Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators. https://doi.org/10. 48550/arXiv.2407.14834 arXiv:2407.14834 [cs] version: 1

work page arXiv 2024

[46] [46]

Leland McInnes, John Healy, and Steve Astels. 2017. hdbscan: Hierarchical density based clustering.The Journal of Open Source Software 2, 11 (March 2017), 205. https://doi.org/10.21105/joss.00205

work page doi:10.21105/joss.00205 2017

[47] [47]

Mites.io. 2020. Mites.io: a full-stack ubiquitous sensing platform. https://mites.io/

work page 2020

[48] [48]

MMAction2. 2020. OpenMMLab’s Next Generation Video Understanding Toolbox and Benchmark. https://github.com/open-mmlab/ mmaction2

work page 2020

[49] [49]

MMPose. 2020. OpenMMLab Pose Estimation Toolbox and Benchmark. https://github.com/open-mmlab/mmpose

work page 2020

[50] [50]

Vimal Mollyn, Karan Ahuja, Dhruv Verma, Chris Harrison, and Mayank Goel. 2022. SAMoSA: Sensing Activities with Motion and Subsampled Audio.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies6, 3 (2022), 1–19

work page 2022

[51] [51]

Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, and Karan Ahuja. 2023. IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds. InProceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3544548.3581392

work page doi:10.1145/3544548.3581392 2023

[52] [52]

Sebastian Münzner, Philip Schmidt, Attila Reiss, Michael Hanselmann, Rainer Stiefelhagen, and Robert Dürichen. 2017. CNN-Based Sensor Fusion Techniques for Multimodal Human Activity Recognition. InProceedings of the 2017 ACM International Symposium on Wearable Computers(Maui, Hawaii)(ISWC ’17). Association for Computing Machinery, New York, NY, USA, 158–1...

work page doi:10.1145/3123021.3123046 2017

[53] [53]

OpenAI. 2025. OpenAI API. https://platform.openai.com/docs/api-reference/ Accessed: 2025-04-29

work page 2025

[54] [54]

2020, doi: 10.5281/zenodo.3509134

The pandas development team. 2020.pandas-dev/pandas: Pandas. pandas-dev. https://doi.org/10.5281/zenodo.3509134

work page doi:10.5281/zenodo.3509134 2020

[55] [55]

Preksha Pareek and Ankit Thakkar. 2021. A survey on video-based human action recognition: recent updates, datasets, challenges, and applications.Artificial Intelligence Review54, 3 (2021), 2259–2322

work page 2021

[56] [56]

Prasoon Patidar, Mayank Goel, and Yuvraj Agarwal. 2023. VAX: Using Existing Video and Audio-based Activity Recognition Models to Bootstrap Privacy-Sensitive Sensors.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies7, 3 (Sept. 2023), 1–24. https://doi.org/10.1145/3610907

work page doi:10.1145/3610907 2023

[57] [57]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python.Journal of Machine Learning Research12 (2011), 2825–2830

work page 2011

[58] [58]

Daniel Perazzo, Natalia Souza Soares, Victor Gouveia de Menezes Lyra, Gustavo Camargo Rocha Lima, Alana Elza Fontes da Gama, Joao Marcelo Xavier Natario Teixeira, and Veronica Teichrieb. 2022. OAK-D as a Platform for Human Movement Analysis: A Case Study. InProceedings of the 23rd Symposium on Virtual and Augmented Reality(Virtual Event, Brazil)(SVR ’21)....

work page doi:10.1145/3488162.3488222 2022

[59] [59]

Prasoon Patidar, Riku Arakawa, Mayank Goel, Yuvraj Agarwal. 2025. OrganicHAR: Open-source repository for the OrganicHAR. https://github.com/synergylabs/OrganicHAR

work page 2025

[60] [60]

Riccardo Presotto, Gabriele Civitarese, and Claudio Bettini. 2022. Federated Clustering and Semi-Supervised learning: A new partnership for personalized Human Activity Recognition.Pervasive and Mobile Computing88 (2022), 101726

work page 2022

[61] [61]

Suneth Ranasinghe, Fadi Al Machot, and Heinrich C Mayr. 2016. A review on applications of activity recognition systems with regard to performance and evaluation.International Journal of Distributed Sensor Networks12, 8 (2016), 1550147716665520. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 9, No. 4, Article 203. Publication date: December 20...

work page 2016

[62] [62]

Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, and David Lindner. 2024. Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning. https://doi.org/10.48550/arXiv.2310.12921 arXiv:2310.12921 [cs] version: 2

work page doi:10.48550/arxiv.2310.12921 2024

[63] [63]

Laurens Samson, Nimrod Barazani, Sennay Ghebreab, and Yuki M. Asano. 2025. Little Data, Big Impact: Privacy-Aware Visual Language Models via Minimal Tuning. https://doi.org/10.48550/arXiv.2405.17423 arXiv:2405.17423 [cs]

work page doi:10.48550/arxiv.2405.17423 2025

[64] [64]

Khoshgoftaar, Jason Van Hulse, and Amri Napolitano

Chris Seiffert, Taghi M. Khoshgoftaar, Jason Van Hulse, and Amri Napolitano. 2010. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance.IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans40, 1 (Jan. 2010), 185–197. https://doi.org/10.1109/TSMCA.2009.2029559

work page doi:10.1109/tsmca.2009.2029559 2010

[65] [65]

Pekka Siirtola and Juha Röning. 2019. Incremental Learning to Personalize Human Activity Recognition Models: The Importance of Human AI Collaboration.Sensors (Basel, Switzerland)19, 23 (Nov. 2019), 5151. https://doi.org/10.3390/s19235151

[66]

Adane Nega Tarekegn, Mohib Ullah, Faouzi Alaya Cheikh, and Muhammad Sajjad. 2023. Enhancing Human Activity Recognition Through Sensor Fusion And Hybrid Deep Learning Model. In 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, Rhodes Island, Greece, 1–5. https://doi.org/10.1109/ICASSPW59220.2023.10193698

[67]

Annalise Vaccarello, Alexander K. Maytin, Yash Kumar, Toluwalashe Onamusi, Haarika A. Reddy, Mayank Goel, Riku Arakawa, Jill Fain Lehman, and Bryan T. Carroll. 2024. Barriers to use of digital assistance for postoperative wound care: a single-center survey of dermatologic surgery patients. Archives of Dermatological Research 316, 7 (June 2024), 376. https:/...

doi:10.1007/s00403-024-03025-w

[68]

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W. Mo...

doi:10.1038/s41592-019-0686-2

[69]

Michalis Vrigkas, Christophoros Nikou, and Ioannis A Kakadiaris. 2015. A review of human activity recognition methods. Frontiers in Robotics and AI 2 (2015), 28

[70]

Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, Wanjing Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi He, Yao Ma, Ming Huang, and Suhang Wang. 2024. A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness. arXiv:2411.0335...

[71]

Shuai Wang, Luoyu Mei, Ruofeng Liu, Wenchao Jiang, Zhimeng Yin, Xianjun Deng, and Tian He. 2025. Multi-Modal Fusion Sensing: A Comprehensive Review of Millimeter-Wave Radar and Its Integration With Other Modalities. IEEE Commun. Surv. Tutorials 27, 1 (2025), 322–352. https://doi.org/10.1109/COMST.2024.3398004

[72]

Pete Warden, Matthew Stewart, Brian Plancher, Colby Banbury, Shvetank Prakash, Emma Chen, Zain Asgar, Sachin Katti, and Vijay Janapa Reddi. 2022. Machine Learning Sensors. https://doi.org/10.48550/ARXIV.2206.03266

[73]

Jason Wu, Chris Harrison, Jeffrey P. Bigham, and Gierad Laput. 2020. Automated Class Discovery and One-Shot Interactions for Acoustic Activity Recognition. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, Honolulu HI USA, 1–14. https://doi.org/10.1145/3313831.3376875

[74]

Tong Wu, Murtadha Aldeer, Tahiya Chowdhury, Amber Haynes, Fateme Nikseresht, Mahsa Pahlavikhah Varnosfaderani, Jiechao Gao, Arsalan Heydarian, Brad Campbell, and Jorge Ortiz. 2021. The Smart Building Privacy Challenge. In Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (Coimbra, Portu...

doi:10.1145/3486611.3492234

[75]

Chengshuo Xia, Xinrui Fang, Riku Arakawa, and Yuta Sugiura. 2022. VoLearn: A Cross-Modal Operable Motion-Learning System Combined with Virtual Avatar and Auditory Feedback. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 6, 2 (2022), 81:1–81:26. https://doi.org/10.1145/3534576

[76]

Kenji Yamanishi, Jun’ichi Takeuchi, Graham J. Williams, and Peter Milne. 2004. On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms. Data Mining and Knowledge Discovery 8, 3 (2004), 275–300. https://doi.org/10.1023/B:DAMI.0000023676.72185.7c

[77]

A. Murat Yağcı, Tevfik Aytekin, and Fikret S. Gürgen. 2016. Balanced random forest for imbalanced data streams. In 2016 24th Signal Processing and Communication Application Conference (SIU). IEEE, Zonguldak, Turkey, 1065–1068. https://doi.org/10.1109/SIU.2016.7495927

[78]

Hyungjun Yoon, Hyeongheon Cha, Hoang C. Nguyen, Taesik Gong, and Sung-Ju Lee. 2024. IMG2IMU: Translating Knowledge from Large-Scale Images to IMU Sensing Applications. https://doi.org/10.48550/arXiv.2209.00945 arXiv:2209.00945 [cs]

[79]

Sojeong Yun and Youn-kyung Lim. 2025. What If Smart Homes Could See Our Homes?: Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, USA, 1–22. https://doi.org/10.1145/3706598.3713265

[80]

Shugang Zhang, Zhiqiang Wei, Jie Nie, Lei Huang, Shuang Wang, and Zhen Li. 2017. A Review on Human Activity Recognition Using Vision-Based Method. Journal of Healthcare Engineering 2017, 1 (2017), 3090343. https://doi.org/10.1155/2017/3090343
