From Coordinates to Context: An LLM-Bootstrapped Semantic Encoding Framework for Privacy-Preserving Mobile Sensing Stress Recognition

Hoang Khang Phan; Nhat Tan Le

arxiv: 2511.23200 · v2 · submitted 2025-11-28 · 💻 cs.CR · cs.HC

Pith reviewed 2026-05-17 04:50 UTC · model grok-4.3

classification 💻 cs.CR cs.HC

keywords privacy-preserving mobile sensing · stress recognition · semantic location encoding · LLM bootstrapped features · GPS data transformation · privacy-utility-explainability · mobile sensing · explainable stress detection

The pith

A privacy-aware model for recognizing stress from mobile location data performs statistically the same as a non-private model while adding strong protection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an end-to-end framework that converts raw GPS coordinates into human-readable semantic features using a self-hosted map engine and an LLM-generated static map. This transformation supports privacy-preserving dataset sharing and stress recognition without exposing exact locations. Through leave-one-subject-out validation, the privacy-aware model shows no statistical difference in stress detection performance compared to a standard non-private approach. Model explanations confirm that the semantic features align with established psychological factors linked to stress. Ablation tests on the GeoLife dataset indicate the framework improves privacy measures by a factor of two to three over non-private baselines.

Core claim

The paper claims that its Privacy-Aware model achieves robust privacy protection without being statistically distinguishable in stress recognition performance from a non-private model, as shown via LOSO validation, while the extracted user-friendly semantic features match psychological literature on stress and the overall framework improves privacy by 2-3 times on the GeoLife dataset compared to non-privacy-aware methods.
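
The leave-one-subject-out (LOSO) protocol behind this claim can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `loso_splits` and the toy subject IDs are hypothetical, and the only assumption is that every sample carries a subject identifier. Each fold holds out all samples of one subject for testing and trains on the rest.

```python
from collections import defaultdict

def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Training rows are every sample that belongs to any other subject.
        train_idx = sorted(
            i for sid, rows in by_subject.items() if sid != held_out for i in rows
        )
        yield held_out, train_idx, test_idx

# Hypothetical toy data: six samples from three subjects.
subjects = ["u1", "u1", "u2", "u3", "u3", "u3"]
folds = list(loso_splits(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Because no subject appears on both sides of a fold, per-fold scores measure generalization to unseen users, which is what the parity claim depends on.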

What carries the argument

The LLM-bootstrapped static map that converts raw coordinates into privacy-preserving semantic location features for stress-related analysis.
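
A rough sketch of how such an encoding could work, assuming the static map can be treated as a fixed lookup table from OSM tags to coarse semantic categories. The table entries, category names, the `encode_visits` helper, and the sample day are all hypothetical; the reverse-geocoding step performed by the self-hosted OSM engine is assumed to have already turned coordinates into tags.

```python
from collections import Counter

# Hypothetical static map: OSM "key=value" tag -> coarse semantic category.
# The paper bootstraps its map with an LLM; these entries are illustrative only.
STATIC_MAP = {
    "amenity=library": "study",
    "amenity=university": "study",
    "leisure=park": "recreation",
    "leisure=fitness_centre": "recreation",
    "amenity=cafe": "food",
    "building=dormitory": "home",
}

def encode_visits(visits, static_map=STATIC_MAP, default="other"):
    """Sum dwell time (seconds) per semantic category.

    `visits` is an iterable of (osm_tag, dwell_seconds) pairs. No coordinates
    enter or leave this function, only tags and durations.
    """
    totals = Counter()
    for tag, seconds in visits:
        totals[static_map.get(tag, default)] += seconds
    return dict(totals)

# Hypothetical day of visits for one user.
day = [
    ("amenity=library", 5400),
    ("leisure=park", 1800),
    ("amenity=university", 3600),
    ("shop=bakery", 300),  # not in the map, falls back to "other"
]
features = encode_visits(day)
print(features)  # {'study': 9000, 'recreation': 1800, 'other': 300}
```

Keeping the map static means the same tag always yields the same category, so the released features do not depend on which user produced them.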

Load-bearing premise

The LLM-bootstrapped static map produces semantic features that are both generalizable across users and directly relevant to psychological stress factors.

What would settle it

Demonstrating that the privacy-aware model's stress recognition accuracy falls significantly below the non-private baseline on an independent, diverse user dataset would falsify the equivalence claim.

Figures

Figures reproduced from arXiv: 2511.23200 by Hoang Khang Phan, Nhat Tan Le.

**Figure 1.** The comparison of the current common GPS feature extraction method and our method. The red oval shape represents the …

**Figure 2.** The top-k-accuracy of the re-identification attack in the …

**Figure 3.** The SHAP beeswarm summary visualization for the stressed classification outcome in RF in AF scenario.

**Figure 4.** The line plot of recreational activity time (in seconds) of stress and non-stress students by week. Stressed student is represented …

**Figure 5.** The line plot of workplace time (in seconds) of stress and non-stress students by week. Stressed student is represented by level …

**Figure 6.** The extended re-identification attacks accuracy (top 1 accuracy (top 5 accuracy)) in 3 scenario (in percent).

Original abstract

Psychological stress is a widespread issue that significantly impacts student well-being and academic performance. Effective remote stress recognition is crucial, yet existing methods often rely on wearable devices or GPS-based clustering techniques that pose privacy risks and lack of human understandable explanations. In this study, we introduce a novel, end-to-end privacy-enhanced framework for semantic location encoding using a self-hosted OSM engine and an LLM-bootstrapped static map for human-friendly feature extraction, and pave a pathway for privacy-aware location data transformation for dataset sharing. We rigorously quantify the privacy-utility-explainability trilemma and demonstrate (via LOSO validation) that our Privacy-Aware (PA) model achieves robust privacy protection without being statistically distinguishable in stress recognition performance from a non-private model. Model explanation analysis highlights that our extracted features, which are user-friendly features, match with psychological literature about stress. In addition, an ablation study on the GeoLife dataset also demonstrates that our privacy framework improves privacy by 2-3 times compared to a non-privacy-aware approach. This suggests that our system can be utilized for the next generation of GPS transformations in open-source datasets for future researchers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a workable pipeline for turning GPS traces into semantic features via self-hosted OSM and an LLM so location data can be shared with less privacy risk while still supporting stress models, but the statistical checks on preserved utility are still light.

The main thing here is a concrete pipeline that converts raw coordinates into human-readable semantic tags using a self-hosted OpenStreetMap engine plus an LLM to build a static map. The goal is to let researchers release location data for stress studies without handing over exact positions. They test it on stress recognition with LOSO splits; the abstract does not name the dataset used for that evaluation and mentions GeoLife only for the privacy ablation. They also try to measure the privacy-utility-explainability balance and report that the privacy-aware version is not statistically distinguishable in performance from the raw version while gaining 2-3 times on their privacy metric in the GeoLife ablation. The explanation part ties the features back to known psychological stress factors, which is a reasonable check.

That combination of self-hosted OSM and LLM bootstrapping for this specific use case in mobile sensing is the clearest new piece; earlier GPS work mostly stuck to clustering or raw coordinates, and wearable-only approaches skip the location angle altogether. The framework description itself is straightforward and could be useful for anyone who needs to transform GPS traces for open datasets.

The soft spots sit mostly in the missing details around the numbers. The abstract does not show error bars, the exact statistical test used for the indistinguishability claim, or a full breakdown of how the privacy metric was computed, so it is hard to judge how robust the parity result really is. There is also the usual post-hoc selection risk when features are derived after the fact. On the generalizability side the stress-test note raises a fair point: without something like cross-user embedding stability checks for the same OSM tags, it is possible the LLM map picked up dataset-specific patterns rather than truly user-independent semantics. If the full methods section has those checks or at least reports variance across multiple map constructions, that would tighten things up.

This is aimed at people working on privacy-preserving mobile health sensing and dataset sharing. A reader who needs a practical way to release location data for stress or similar tasks would get a usable starting point even if they end up tweaking the LLM prompting. It is worth sending to a serious referee because the core engineering idea is clear, the problem is real, and the reported numbers are at least directionally interesting; the review would mainly need to press on the statistical gaps and the cross-user stability question rather than reject the framing outright.

Referee Report

3 major / 2 minor

Summary. The paper introduces an end-to-end privacy-enhanced framework for semantic location encoding that combines a self-hosted OSM engine with an LLM-bootstrapped static map to extract human-friendly features from GPS data. It claims that the resulting Privacy-Aware (PA) model, evaluated via LOSO validation, achieves robust privacy protection while delivering stress-recognition performance that is statistically indistinguishable from a non-private baseline, together with a 2–3× privacy improvement in an ablation study on the GeoLife dataset and feature explanations that align with the psychological stress literature.

Significance. If the central performance-parity and privacy-gain claims hold under rigorous statistical scrutiny, the work would provide a practical route for releasing location datasets that preserve utility for downstream stress-detection tasks while substantially reducing re-identification risk. The explicit quantification of the privacy-utility-explainability trilemma and the use of an open OSM + LLM pipeline are positive steps toward reproducible, human-interpretable privacy transformations.

major comments (3)

Abstract and §4 (LOSO validation): the claim that the PA model is 'not statistically distinguishable' from the non-private model is load-bearing for the utility-preservation argument, yet the manuscript provides neither error bars on the reported accuracies, the exact statistical test employed, nor the p-value threshold used to assert indistinguishability.
§3.2 (LLM-bootstrapped static map) and §5 (explanation analysis): the assertion that the extracted semantic features remain predictive across held-out users rests on the unverified assumption that the LLM mapping produces user-independent embeddings for identical OSM tags; no quantitative stability metric (e.g., cosine similarity of embeddings for the same tag across LOSO folds) is reported, leaving open the possibility that observed parity is an artifact of the particular data partition.
§4.3 (privacy metrics): the reported 2–3× privacy improvement is central to the trilemma claim, but the precise definition of the privacy metric, the adversary model, and the computation details (including any post-hoc feature selection) are not specified, preventing independent verification of the gain.

minor comments (2)

Notation for the privacy-utility-explainability trilemma should be introduced with an equation or explicit definition in §2 rather than left implicit.
Figure captions and axis labels in the ablation study should explicitly state the privacy metric being plotted to avoid ambiguity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important areas for improving statistical rigor, reproducibility, and clarity in our claims regarding performance parity, feature stability, and privacy gains. We address each major comment below and commit to specific revisions in the next manuscript version.

Referee: Abstract and §4 (LOSO validation): the claim that the PA model is 'not statistically distinguishable' from the non-private model is load-bearing for the utility-preservation argument, yet the manuscript provides neither error bars on the reported accuracies, the exact statistical test employed, nor the p-value threshold used to assert indistinguishability.

Authors: We agree that the current presentation of the indistinguishability claim is insufficient without supporting statistical details. In the revised manuscript we will add error bars (standard deviations computed across the 10 LOSO folds) to all accuracy figures in §4. We will explicitly state that a paired t-test was performed on the per-fold accuracies between the PA and non-private models, report the resulting p-values, and adopt p > 0.05 as the threshold for declaring no statistically significant difference. These additions will also be summarized concisely in the abstract.
Revision: yes
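
The per-fold comparison described in this response can be illustrated with a standard-library sketch. The accuracies below are hypothetical placeholders, not results from the paper. The code reports the fold mean and standard deviation (the proposed error bars) and the paired t statistic. Because the Python standard library has no t-distribution, the p-value here comes from an exact sign-flip permutation test, which is a stand-in and not the test the response names; `scipy.stats.ttest_rel` would give the parametric paired t-test p-value.

```python
import math
from itertools import product
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """Paired t statistic for per-fold scores a and b (same folds, same order)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def sign_flip_p_value(a, b):
    """Exact two-sided sign-flip permutation p-value for the paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = total = 0
    # Under the null, each paired difference is equally likely to have either sign.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical per-fold accuracies for 10 LOSO folds (illustrative values only).
pa_acc = [0.71, 0.68, 0.74, 0.70, 0.66, 0.73, 0.69, 0.72, 0.67, 0.70]
af_acc = [0.72, 0.67, 0.75, 0.69, 0.68, 0.72, 0.70, 0.71, 0.68, 0.71]

t_stat = paired_t_statistic(pa_acc, af_acc)
p_value = sign_flip_p_value(pa_acc, af_acc)
print(f"PA  mean={mean(pa_acc):.3f} sd={stdev(pa_acc):.3f}")
print(f"AF  mean={mean(af_acc):.3f} sd={stdev(af_acc):.3f}")
print(f"paired t={t_stat:.3f}, sign-flip p={p_value:.3f}")
```

With 10 folds the permutation test enumerates only 2^10 = 1024 sign assignments, so the exact p-value is cheap to compute.
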
Referee: §3.2 (LLM-bootstrapped static map) and §5 (explanation analysis): the assertion that the extracted semantic features remain predictive across held-out users rests on the unverified assumption that the LLM mapping produces user-independent embeddings for identical OSM tags; no quantitative stability metric (e.g., cosine similarity of embeddings for the same tag across LOSO folds) is reported, leaving open the possibility that observed parity is an artifact of the particular data partition.

Authors: We acknowledge that a direct stability metric is needed to substantiate user-independence of the LLM-generated embeddings. In the revision we will compute, for every OSM tag that appears in multiple folds, the cosine similarity between the embeddings produced by the LLM in each LOSO training partition. We will report the mean and standard deviation of these similarities in a new paragraph in §3.2 and reference the result when discussing feature stability in §5. This analysis will be performed on the same static map used for the main experiments.
Revision: yes
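
One way the proposed stability metric could be computed is sketched below, assuming each fold yields a dictionary from OSM tag to an embedding vector. The data layout, the `tag_stability` helper, and the toy vectors are hypothetical. For every tag seen in at least two folds, the code takes the cosine similarity over all fold pairs and reports the mean and population standard deviation.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def tag_stability(embeddings_by_fold):
    """Map each tag seen in >= 2 folds to (mean, population std) of pairwise cosine.

    `embeddings_by_fold` has the hypothetical shape {fold_id: {tag: vector}}.
    """
    per_tag = {}
    for fold in embeddings_by_fold.values():
        for tag, vec in fold.items():
            per_tag.setdefault(tag, []).append(vec)
    result = {}
    for tag, vecs in per_tag.items():
        if len(vecs) < 2:
            continue  # stability is undefined for a tag seen in a single fold
        sims = [cosine(u, v) for u, v in combinations(vecs, 2)]
        result[tag] = (mean(sims), pstdev(sims))
    return result

# Hypothetical embeddings from three folds.
folds = {
    0: {"leisure=park": [1.0, 2.0], "amenity=cafe": [1.0, 0.0]},
    1: {"leisure=park": [2.0, 4.0]},
    2: {"leisure=park": [1.0, 2.1]},
}
stability = tag_stability(folds)
for tag, (avg, sd) in stability.items():
    print(f"{tag}: mean={avg:.4f} std={sd:.4f}")
```

A mean close to 1 with a small spread would indicate that a tag is encoded consistently regardless of which users were in the training partition.
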
Referee: §4.3 (privacy metrics): the reported 2–3× privacy improvement is central to the trilemma claim, but the precise definition of the privacy metric, the adversary model, and the computation details (including any post-hoc feature selection) are not specified, preventing independent verification of the gain.

Authors: We agree that the privacy evaluation section requires greater specificity. In the revised §4.3 we will (1) define the privacy metric as the reduction in an adversary’s re-identification accuracy when the adversary is given the transformed semantic features and attempts to match them to the original GPS trajectories, (2) describe the adversary model (a logistic regression classifier trained on a held-out portion of the original GeoLife data), and (3) detail the exact computation, including that no post-hoc feature selection was applied beyond the semantic encoding itself. The 2–3× factor will be expressed as the ratio of re-identification accuracies before and after the privacy transformation.
Revision: yes
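
The ratio described in this response can be made concrete with a small sketch. Everything here is illustrative: the adversary's ranked guesses and user IDs are made up, and `top_k_accuracy` and `privacy_gain` are hypothetical helper names. The code computes top-k re-identification accuracy from ranked candidate lists and expresses the privacy gain as accuracy before the transformation divided by accuracy after it.

```python
def top_k_accuracy(ranked_candidates, true_ids, k):
    """Fraction of traces whose true user is among the adversary's top-k guesses."""
    hits = sum(
        1 for ranked, truth in zip(ranked_candidates, true_ids) if truth in ranked[:k]
    )
    return hits / len(true_ids)

def privacy_gain(acc_before, acc_after):
    """Ratio of re-identification accuracy before vs. after the transformation."""
    return acc_before / acc_after

# Hypothetical ground truth and adversary rankings (best guess first).
true_users = ["u1", "u2", "u3", "u4"]
ranked_raw = [["u1", "u2"], ["u2", "u1"], ["u3", "u4"], ["u1", "u4"]]
ranked_semantic = [["u2", "u1"], ["u2", "u3"], ["u4", "u3"], ["u3", "u1"]]

raw_top1 = top_k_accuracy(ranked_raw, true_users, k=1)            # 0.75
semantic_top1 = top_k_accuracy(ranked_semantic, true_users, k=1)  # 0.25
print(privacy_gain(raw_top1, semantic_top1))  # 3.0
```

A ratio above 1 means the adversary re-identifies fewer users from the transformed features than from the raw ones; reporting it for several values of k shows whether the gain holds beyond the single best guess.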

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external LOSO validation and ablation, not self-referential reduction

The paper describes an applied ML framework that transforms GPS coordinates into LLM-derived semantic features for stress recognition while adding privacy mechanisms. Its strongest claim (PA model performance statistically indistinguishable from non-private baseline) is asserted via LOSO validation and a separate ablation study on the GeoLife dataset showing 2-3x privacy improvement. These evaluations are presented as independent checks performed after feature extraction, not as quantities fitted or defined in terms of the target metric. No equations, uniqueness theorems, or self-citations are invoked to force the outcome by construction. The LLM static map is treated as a fixed preprocessing step whose utility is then measured externally; any limitations in cross-user stability would affect evidence strength but do not create a definitional or fitted-input loop within the reported derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that semantic features derived from the LLM map capture stress-relevant context and that privacy metrics are independent of the downstream classifier performance.

axioms (1)

domain assumption: the LLM-generated semantic map produces features that align with the psychological literature on stress and generalize across users.
Invoked in the model explanation analysis and in the claim that the human-friendly features match the literature.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

self-hosted OSM engine and an LLM-bootstrapped static map for human-friendly feature extraction... Privacy-Aware (PA) model achieves robust privacy protection without being statistically distinguishable in stress recognition performance

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Saeed Abdullah and Tanzeem Choudhury. 2018. Sensing technologies for monitoring serious mental illnesses.IEEE MultiMedia25, 1 (2018), 61–75

work page 2018

[2] [2]

Yasin Acikmese and S Emre Alptekin. 2019. Prediction of stress levels with LSTM and passive mobile sensors.Procedia Computer Science159 (2019), 658–667

work page 2019

[3] [3]

Stress: statistics

American Psychiatric Association. 2018. "Stress: statistics". https://www.mentalhealth.org.uk/explore-mental-health/statistics/stress-statistics. Accessed: 2025-03-08

work page 2018

[4] [4]

bad” days are more biased than memories of “good

Charles S Areni and Mitchell Burger. 2008. Memories of “bad” days are more biased than memories of “good” days: Past Saturdays vary, but past Mondays are always blue.Journal of Applied Social Psychology38, 6 (2008), 1395–1415

work page 2008

[5] [5]

Raiyan Abdul Baten, Yozen Liu, Heinrich Peters, Francesco Barbieri, Neil Shah, Leonardo Neves, and Maarten W Bos. 2023. Predicting future location categories of users in a large social platform. InProceedings of the International AAAI Conference on Web and Social Media, Vol. 17. 47–58

work page 2023

[6] [6]

Irene Bonafonte, Cristina Bustos, Abraham Larrazolo, Gilberto Lorenzo Martínez Luna, Adolfo Guzmán Arenas, Xavier Baró, Isaac Tourgeman, Mercedes Balcells, and Agata Lapedriza. 2023. Analyzing the contribution of different passively collected data to predict Stress and Depression. In 2023 11th International Conference on Affective Computing and Intelligen...

work page 2023

[7] [7]

Leo Breiman. 2001. Random forests.Machine learning45, 1 (2001), 5–32

work page 2001

[8] [8]

Chawla, Kevin W

Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: synthetic minority over-sampling technique.J. Artif. Int. Res.16, 1 (June 2002), 321–357

work page 2002

[9] [9]

Jiawei Chen, Janusz Konrad, and Prakash Ishwar. 2018. Vgan-based image representation learning for privacy-preserving facial expression recognition. InProceedings of the IEEE conference on computer vision and pattern recognition workshops. 1570–1579. Manuscript submitted to ACM 18 Hoang Khang Phan and Nhat Tan Le

work page 2018

[10] [10]

Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). ACM, 785–794. https://doi.org/10.1145/2939672.2939785

work page doi:10.1145/2939672.2939785 2016

[11] [11]

Alex W DaSilva, Jeremy F Huckins, Rui Wang, Weichen Wang, Dylan D Wagner, and Andrew T Campbell. 2019. sch.JMIR mHealth and uHealth7, 3 (2019), e12084

work page 2019

[12] [12]

Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. InProceedings of the Second International Conference on Knowledge Discovery and Data Mining(Portland, Oregon)(KDD’96). AAAI Press, 226–231

work page 1996

[13] [13]

Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. 2020. Inverting gradients-how easy is it to break privacy in federated learning?Advances in neural information processing systems33 (2020), 16937–16947

work page 2020

[14] [14]

Anoushka Harit, Zhongtian Sun, Jongmin Yu, and Noura Al Moubayed. 2024. Monitoring Behavioral Changes Using Spatiotemporal Graphs: A Case Study on the StudentLife Dataset. InNeurIPS 2024 Workshop on Behavioral Machine Learning. https://openreview.net/forum?id=WyAKFNSYl2

work page 2024

[15] [15]

Samiul Hasan, Xianyuan Zhan, and Satish V Ukkusuri. 2013. Understanding urban human activity and mobility patterns using large-scale location-based data from online social media. InProceedings of the 2nd ACM SIGKDD international workshop on urban computing. 1–8

[16]

Nutthaporn Junsomboon and Tanasanee Phienthrakul. 2017. Combining Over-Sampling and Under-Sampling Techniques for Imbalance Dataset. In Proceedings of the 9th International Conference on Machine Learning and Computing (Singapore, Singapore) (ICMLC ’17). Association for Computing Machinery, New York, NY, USA, 243–247. https://doi.org/10.1145/3055635.3056643

[17]

Lin Sze Khoo, Mei Kuan Lim, Chun Yong Chong, and Roisin McNaney. 2024. Machine learning for multimodal mental health detection: a systematic review of passive sensing approaches. Sensors 24, 2 (2024), 348

[18]

Mika Koivisto and Simone Grassini. 2023. Mental imagery of nature induces positive psychological effects. Current Psychology 42, 34 (2023), 30348–30363

[19]

Bo Liu, Wanlei Zhou, Tianqing Zhu, Longxiang Gao, and Yong Xiang. 2018. Location privacy and its applications: A systematic study. IEEE access 6 (2018), 17606–17624

[20]

Yifan Liu, Chenchen Kuai, Xishun Liao, Haoxuan Ma, Brian Yueshuai He, and Jiaqi Ma. 2024. Semantic trajectory data mining with llm-informed poi classification. In 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 207–213

[21]

Yunfei Luo, Iman Deznabi, Abhinav Shaw, Natcha Simsiri, Tauhidur Rahman, and Madalina Fiterau. 2024. Dynamic clustering via branched deep learning enhances personalization of stress prediction from mobile sensor data. Scientific Reports 14, 1 (2024), 6631

[22]

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics. PMLR, 1273–1282

[23]

Eric A Morris and Erick Guerra. 2015. Are we there yet? Trip duration and mood during travel. Transportation research part F: traffic psychology and behaviour 33 (2015), 38–47

[24]

Subigya Nepal, Wenjun Liu, Arvind Pillai, Weichen Wang, Vlado Vojdanovski, Jeremy F. Huckins, Courtney Rogers, Meghan L. Meyer, and Andrew T. Campbell. 2024. Capturing the College Experience: A Four-Year Mobile Sensing Study of Mental Health, Resilience and Behavior of College Students during the Pandemic. Proc. ACM Interact. Mob. Wearable Ubiquitous Techn...

work page doi:10.1145/3643501 2024

[25]

Agnieszka Olszewska-Guizzo, Angelia Sia, Anna Fogel, and Roger Ho. 2022. Features of urban green spaces associated with positive emotions, mindfulness and relaxation. Scientific reports 12, 1 (2022), 20695

[26]

OpenStreetMap contributors. 2017. Planet dump retrieved from https://planet.osm.org . https://www.openstreetmap.org

[27]

Katarzyna Siła-Nowicka, Jan Vandrol, Taylor Oshan, Jed A Long, Urška Demšar, and A Stewart Fotheringham. 2016. Analysis of human mobility patterns from GPS trajectories and contextual information. International Journal of Geographical Information Science 30, 5 (2016), 881–906

[28]

Samuel Sousa and Roman Kern. 2023. How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing. Artificial Intelligence Review 56, 2 (2023), 1427–1492

[29]

Arthur A Stone, Stefan Schneider, and James K Harter. 2012. Day-of-week mood patterns in the United States: On the existence of ‘Blue Monday’, ‘Thank God it’s Friday’ and weekend effects. The Journal of Positive Psychology 7, 4 (2012), 306–314

[30]

Gemini Team. 2024. Gemini: A Family of Highly Capable Multimodal Models. arXiv:2312.11805 [cs.CL] https://arxiv.org/abs/2312.11805

[31]

The American Institute of Stress. 2018. Workplace Stress. https://www.mentalhealth.org.uk/explore-mental-health/statistics/stress-statistics Accessed: 2025-03-08

[32]

transformingeducation.org. 2024. Student Stress Statistics [2024 Update]. https://transformingeducation.org/student-stress-statistics/ Accessed: 2025-03-08

[33]

Julio Vega, Meng Li, Kwesi Aguillera, Nikunj Goel, Echhit Joshi, Kirtiraj Khandekar, Krina C Durica, Abhineeth R Kunta, and Carissa A Low. 2021. Reproducible analysis pipeline for data streams: open-source software to process data collected with mobile devices. Frontiers in digital health 3 (2021), 769823

[34]

Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T. Campbell. 2014. StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computin...

work page doi:10.1145/2632048.2632054 2014

[35]

Weichen Wang, Subigya Nepal, Jeremy F. Huckins, Lessley Hernandez, Vlado Vojdanovski, Dante Mack, Jane Plomp, Arvind Pillai, Mikio Obuchi, Alex daSilva, Eilis Murphy, Elin Hedlund, Courtney Rogers, Meghan Meyer, and Andrew Campbell. 2022. First-Gen Lens: Assessing Mental Health...

work page doi:10.1145/3543194 2022

[36]

Yonghui Xiao and Li Xiong. 2015. Protecting locations with differential privacy under temporal correlations. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security. 1298–1309

[37]

Xuhai Xu, Xin Liu, Han Zhang, Weichen Wang, Subigya Nepal, Yasaman Sefidgar, Woosuk Seo, Kevin S. Kuehn, Jeremy F. Huckins, Margaret E. Morris, Paula S. Nurius, Eve A. Riskin, Shwetak Patel, Tim Althoff, Andrew Campbell, Anind K. Dey, and Jennifer Mankoff. 2023. GLOBEM: Cross-Dataset Generalization of Longitudinal Human Behavior Modeling. Proc. ACM Interac...

work page doi:10.1145/3569485 2023

[38]

Abbas Yazdinejad, Ali Dehghantanha, Hadis Karimipour, Gautam Srivastava, and Reza M Parizi. 2024. A robust privacy-preserving federated learning model against model poisoning attacks. IEEE Transactions on Information Forensics and Security 19 (2024), 6693–6708

[39]

Ling Yin, Qian Wang, Shih-Lung Shaw, Zhixiang Fang, Jinxing Hu, Ye Tao, and Wei Wang. 2015. Re-identification risk versus data utility for aggregated mobility research using mobile phone location data. PloS one 10, 10 (2015), e0140589

[40]

Ligeng Zhu, Zhijian Liu, and Song Han. 2019. Deep leakage from gradients. Advances in neural information processing systems 32 (2019).

A Ablation studies

A.1 Ablation Study Experiments

To isolate the predictive power of different feature groups, we designed an ablation study. In addition to the full PA (Privacy-Aware) and AF (All-Features) models, we evalua...
