pith. machine review for the scientific record.

arxiv: 2604.06204 · v1 · submitted 2026-03-15 · 💻 cs.CL · cs.AI · cs.HC

Recognition: no theorem link

SensorPersona: An LLM-Empowered System for Continual Persona Extraction from Longitudinal Mobile Sensor Streams

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 11:41 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.HC
keywords persona extraction · LLM agents · mobile sensors · longitudinal data · continual inference · personalization · sensor streams

The pith

SensorPersona extracts stable user personas from ongoing mobile sensor streams using LLMs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SensorPersona as a system that continually infers user personas from multimodal longitudinal sensor data collected unobtrusively on mobile devices. It moves beyond chat-history methods by encoding sensor contexts, applying hierarchical reasoning across episodes to capture physical patterns alongside psychosocial traits and life experiences, and using incremental verification to update profiles over time. Evaluation on 1,580 hours of data from 20 participants shows up to 31.4 percent higher recall in persona extraction and an 85.7 percent win rate for resulting agent responses. A sympathetic reader would care because more accurate, behavior-grounded personas could let LLM agents respond with greater relevance and consistency without requiring explicit user disclosures.

Core claim

SensorPersona first performs person-oriented context encoding on continuous sensor streams, then employs hierarchical persona reasoning that integrates intra- and inter-episode analysis to infer personas spanning physical patterns, psychosocial traits, and life experiences, and finally applies clustering-aware incremental verification together with temporal evidence-aware updating to adapt to evolving personas.
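Sketched as code, the three-stage loop of this claim might look as follows, with each persona kept as a description plus its supporting episodes (the representation the paper's system overview uses). `encode`, `reason`, and `verify` are hypothetical stand-ins for the three stages, not the paper's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Persona:
    description: str                              # d_p: natural-language description
    episodes: list = field(default_factory=list)  # E_p: episodes supporting the persona

def update_personas(personas, sensor_window, encode, reason, verify):
    """One incremental step over a new window of sensor data.

    encode, reason, and verify stand in for the paper's three stages:
    person-oriented context encoding, hierarchical (intra-/inter-episode)
    persona reasoning, and clustering-aware incremental verification.
    """
    contexts = encode(sensor_window)         # stage 1: semantic contexts
    candidates = reason(contexts, personas)  # stage 2: candidate personas
    for cand in candidates:                  # stage 3: merge or append
        match = verify(cand, personas)       # an existing Persona, or None
        if match is not None:
            match.episodes.extend(cand.episodes)  # reinforce with new evidence
        else:
            personas.append(cand)                 # genuinely new persona
    return personas
```

Temporal evidence-aware updating would additionally re-weight or retire personas whose supporting episodes go stale; that bookkeeping is omitted here.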

What carries the argument

Hierarchical persona reasoning that combines intra- and inter-episode analysis on sensor-derived contexts, supported by clustering-aware incremental verification and temporal updating.

If this is right

  • LLM-based agents achieve up to 31.4 percent higher recall when extracting personas from sensor streams rather than chat logs.
  • Persona-aware agent responses win 85.7 percent of head-to-head comparisons against baselines.
  • User satisfaction rises measurably when agents draw on sensor-inferred traits and experiences.
  • Personas remain stable and updatable across months of data collected in varied locations.
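For concreteness, the two headline numbers can be read as the following metrics. The paper does not publish its metric code, so the matching predicate here is an assumption:

```python
def persona_recall(extracted, ground_truth, matches):
    """Fraction of ground-truth persona items covered by some extracted item.

    `matches(gt, ex)` is a hypothetical predicate (e.g. semantic similarity
    above a threshold) deciding whether extracted item `ex` covers `gt`.
    """
    hits = sum(any(matches(gt, ex) for ex in extracted) for gt in ground_truth)
    return hits / len(ground_truth)

def win_rate(outcomes):
    """Share of head-to-head comparisons won; ties count as half a win."""
    score = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    return sum(score[o] for o in outcomes) / len(outcomes)
```

Under this reading, the 31.4% figure is a relative lift in `persona_recall` over the strongest baseline, and 85.7% is `win_rate` over pairwise response comparisons.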

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Passive sensor-based profiling could support applications such as long-term behavior monitoring or adaptive interfaces without requiring active user input.
  • Combining sensor personas with occasional self-reports might reduce inference errors for traits that sensors capture weakly.
  • Widespread use would require safeguards for data privacy and consent because the system runs continuously on personal devices.

Load-bearing premise

Continuous multimodal sensor streams from mobile devices contain sufficient reliable signals to allow accurate inference of stable personas that include psychosocial traits and life experiences without substantial noise or bias.

What would settle it

A study that compares extracted personas against independently verified ground-truth profiles while deliberately adding realistic sensor noise or restricting data to short time windows to check whether the reported recall and win-rate gains disappear.
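A minimal version of that stress test only needs a way to corrupt the stream before re-running extraction. The corruption model below (random record dropout plus window truncation) is an illustrative choice, not the paper's:

```python
import random

def degrade_stream(stream, drop_rate=0.3, window_hours=None, seed=0):
    """Return a degraded copy of a time-ordered sensor stream.

    Each record is assumed to be a (timestamp_hours, reading) pair.
    drop_rate randomly removes records (simulated noise/outages);
    window_hours, if set, keeps only the most recent window.
    """
    rng = random.Random(seed)
    out = [r for r in stream if rng.random() >= drop_rate]
    if window_hours is not None and out:
        latest = max(t for t, _ in out)
        out = [(t, x) for t, x in out if t >= latest - window_hours]
    return out
```

Re-measuring recall and win rate on `degrade_stream` outputs at increasing `drop_rate` and decreasing `window_hours` would show whether the reported gains survive realistic degradation.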

Figures

Figures reproduced from arXiv: 2604.06204 by Bufang Yang, Kaiwei Liu, Lilin Xu, Xiaofan Jiang, Yixuan Li, Zhenyu Yan.

Figure 1. Application scenario of SensorPersona.
Figure 4. Performance of existing approaches for per…
Figure 5. Evolving personas over time.
Figure 6. System overview of SensorPersona.
Figure 7. Semantic similarity heatmap of multimodal…
Figure 8. SensorPersona derives personas from sensor…
Figure 9. Hierarchical persona maintenance in Sensor…
Figure 10. Overall performance of persona extraction.
Figure 11. Overall performance on the persona-aware…
Figure 14. Impact of context compression.
Figure 15. Impact of multi-dimensional personas.
Figure 18. Changes in persona weight over time.
Figure 20. Token consumption and cost. Phy Ep and Psy Ep are physical and psychosocial episode construction, respectively; cost is computed using GPT-4.1 pricing.
Figure 21. Impact of α.
Figure 23. SensorPersona consistently outperforms the baselines across different dimensions of persona extraction on both tasks.
Figure 24. Comparison between SensorPersona and the baseline in both objective metrics and human ratings.
Figure 25. Persona extraction results: (a) all users; (b) users with >100 h of data.
Figure 26. Agent response results.
Figure 28. An example of persona clustering in SensorPersona, where personas with similar patterns are grouped to…
Original abstract

Personalization is essential for Large Language Model (LLM)-based agents to adapt to users' preferences and improve response quality and task performance. However, most existing approaches infer personas from chat histories, which capture only self-disclosed information rather than users' everyday behaviors in the physical world, limiting the ability to infer comprehensive user personas. In this work, we introduce SensorPersona, an LLM-empowered system that continuously infers stable user personas from multimodal longitudinal sensor streams unobtrusively collected from users' mobile devices. SensorPersona first performs person-oriented context encoding on continuous sensor streams to enrich the semantics of sensor contexts. It then employs hierarchical persona reasoning that integrates intra- and inter-episode reasoning to infer personas spanning physical patterns, psychosocial traits, and life experiences. Finally, it employs clustering-aware incremental verification and temporal evidence-aware updating to adapt to evolving personas. We evaluate SensorPersona on a self-collected dataset containing 1,580 hours of sensor data from 20 participants, collected over up to 3 months across 17 cities on 3 continents. Results show that SensorPersona achieves up to 31.4% higher recall in persona extraction, an 85.7% win rate in persona-aware agent responses, and notable improvements in user satisfaction compared to state-of-the-art baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper introduces SensorPersona, an LLM-based system for continual persona extraction from longitudinal multimodal mobile sensor streams (GPS, accelerometer, app logs, etc.). It proposes person-oriented context encoding, hierarchical intra- and inter-episode persona reasoning to infer physical patterns, psychosocial traits, and life experiences, plus clustering-aware incremental verification and temporal updating for adaptation. Evaluation on a self-collected 20-participant, 1,580-hour dataset collected over up to 3 months reports up to 31.4% higher recall in persona extraction, an 85.7% win rate in persona-aware agent responses, and improved user satisfaction versus state-of-the-art baselines.

Significance. If the performance claims are substantiated with rigorous validation, the work would advance personalized LLM agents by demonstrating extraction of stable, comprehensive personas from unobtrusive real-world sensor data rather than chat histories alone. The continual adaptation mechanisms address an important practical gap. However, the current evaluation provides insufficient detail on metrics, baselines, and ground truth to support these claims at the reported level.

major comments (3)
  1. [Evaluation section] The manuscript reports quantitative gains (31.4% recall lift, 85.7% win rate) on a self-collected dataset but provides no definition of the recall metric for persona extraction (e.g., how true positives are determined for psychosocial traits), no implementation details or hyperparameters for the state-of-the-art baselines, no statistical significance tests, and no controls for confounds such as prompt sensitivity or dataset collection biases. This leaves the central performance claims weakly supported.
  2. [Dataset and ground-truth description, likely §4.1] The 20-user, 1,580-hour corpus is described as containing sensor streams across 17 cities, but the paper does not report an independent validation process (e.g., validated questionnaires, blinded expert labeling of raw streams, or inter-rater reliability) for the inferred psychosocial traits and life experiences. Without such external ground truth, the recall metric risks measuring consistency with LLM priors rather than recovery of signals present in the noisy, indirect sensor data.
  3. [§3.3 Clustering-aware incremental verification] The temporal evidence-aware updating mechanism is presented as enabling adaptation to evolving personas, yet no ablation is shown isolating its contribution versus simpler recency-based updates, and no analysis addresses how sensor noise or missing data periods affect persona stability over the multi-month collection window.
minor comments (3)
  1. [Abstract and §1] The abstract and introduction use the term 'stable personas' without clarifying the time scale over which stability is measured or how drift is quantified.
  2. [Figure 6] The system-overview figure would benefit from explicit annotation of the input sensor modalities and the output persona representation format.
  3. [Related Work] Missing references to prior work on sensor-based personality inference (e.g., from mobile sensing literature) and LLM-based persona modeling would strengthen the related-work section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We will revise the manuscript to strengthen the evaluation details, ground-truth description, and analysis as outlined below.

read point-by-point responses
  1. Referee: [Evaluation section] The manuscript reports quantitative gains (31.4% recall lift, 85.7% win rate) on a self-collected dataset but provides no definition of the recall metric for persona extraction (e.g., how true positives are determined for psychosocial traits), no implementation details or hyperparameters for the state-of-the-art baselines, no statistical significance tests, and no controls for confounds such as prompt sensitivity or dataset collection biases. This leaves the central performance claims weakly supported.

    Authors: We agree these details are needed to support the claims. In the revision we will add: a precise definition of recall (true positives determined via participant self-confirmation in post-study surveys for each trait category); full hyperparameters and code-level implementation details for all baselines; paired statistical significance tests with p-values; and prompt-sensitivity controls by averaging results over 5 prompt variants with reported variance. Dataset collection biases will also be discussed. revision: yes

  2. Referee: [Dataset and ground-truth description] The 20-user, 1,580-hour corpus is described as containing sensor streams across 17 cities, but the paper does not report an independent validation process (e.g., validated questionnaires, blinded expert labeling of raw streams, or inter-rater reliability) for the inferred psychosocial traits and life experiences. Without such external ground truth, the recall metric risks measuring consistency with LLM priors rather than recovery of signals present in the noisy, indirect sensor data.

    Authors: We acknowledge the absence of blinded expert labeling in the original submission. Ground truth was obtained via participant self-confirmation questionnaires administered after data collection. In the revision we will explicitly describe this process, add a dedicated limitations paragraph on self-report biases, and report inter-rater reliability (Cohen's kappa) on a 5-user subset labeled by two independent annotators. This will clarify that recall is anchored to user-validated signals rather than LLM priors alone. revision: partial

  3. Referee: [§3.3 Clustering-aware incremental verification] The temporal evidence-aware updating mechanism is presented as enabling adaptation to evolving personas, yet no ablation is shown isolating its contribution versus simpler recency-based updates, and no analysis addresses how sensor noise or missing data periods affect persona stability over the multi-month collection window.

    Authors: We agree an ablation is warranted. The revised manuscript will include a new ablation table comparing the full clustering-aware incremental verification against a recency-only update baseline, quantifying the incremental gains in recall and stability. We will also add an analysis of persona stability by segmenting the timeline into high-noise/missing-data periods (using sensor quality flags) and reporting drift metrics across the 3-month window. revision: yes
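The inter-rater reliability statistic the rebuttal promises is standard and easy to audit. A minimal sketch of Cohen's kappa, κ = (p_o − p_e)/(1 − p_e), for two annotators over the same items:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n  # observed agreement
    ca, cb = Counter(labels_a), Counter(labels_b)
    # chance agreement from each annotator's marginal label frequencies
    p_e = sum(ca[lab] * cb[lab] for lab in ca.keys() | cb.keys()) / n ** 2
    return (p_o - p_e) / (1 - p_e)
```

Values near 0 would indicate the 5-user subset's labels agree no better than chance, undercutting the ground-truth claim; values above roughly 0.6 are conventionally read as substantial agreement.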

Circularity Check

0 steps flagged

No significant circularity in derivation or evaluation chain.

full rationale

The paper describes an LLM-based pipeline (context encoding, hierarchical reasoning, incremental verification) evaluated via comparative recall and win-rate metrics against external baselines on a held-out self-collected dataset. No equations, fitted parameters, or self-citations are invoked that reduce any central claim to a definition or input by construction. The reported improvements are independent comparative results rather than self-referential reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that LLMs can reliably perform persona inference from encoded sensor contexts and that personas remain stable enough for incremental updating.

axioms (2)
  • domain assumption LLMs can accurately infer psychosocial traits and life experiences from person-oriented encodings of sensor data
    Invoked in the hierarchical persona reasoning stage.
  • domain assumption User personas extracted from sensor streams are sufficiently stable to support continual updating without frequent contradictions
    Required for the temporal evidence-aware updating component.

pith-pipeline@v0.9.0 · 5553 in / 1312 out tokens · 43987 ms · 2026-05-15T11:41:54.433558+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

78 extracted references · 78 canonical work pages · 6 internal anchors

  1. [1]

    Resemblyzer

    2019. Resemblyzer. https://github.com/resemble-ai/Resemblyzer

  2. [2]

    SenseVoice

    2025. SenseVoice. https://github.com/FunAudioLLM/SenseVoice

  3. [3]

    Claude Opus Model

    2026. Claude Opus Model. https://www.anthropic.com/claude/opus

  4. [4]

    Google Gemini

    2026. Google Gemini. https://gemini.google.com/

  5. [5]

    OpenAI Models

    2026. OpenAI Models. https://developers.openai.com/api/docs/models/

  6. [6]

    OpenClaw: The AI that actually does things

    2026. OpenClaw: The AI that actually does things. https://openclaw.ai/

  7. [7]

    American Psychological Association. [n. d.]. Personality. https://www.apa.org/topics/personality. Accessed: 2026-02-05

  8. [8]

    Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, and Tat-Seng Chua. 2025. Large language models empowered personalized web agents. In Proceedings of the ACM on Web Conference

  9. [9]

    Ding Chen, Simin Niu, Kehang Li, Peng Liu, Xiangping Zheng, Bo Tang, Xinchi Li, Feiyu Xiong, and Zhiyu Li. 2025. Halumem: Evaluating hallucinations in memory systems of agents. arXiv preprint arXiv:2511.03506 (2025)

  10. [10]

    Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, and Deshraj Yadav. 2025. Mem0: Building production-ready ai agents with scalable long-term memory. arXiv preprint arXiv:2504.19413 (2025)

  11. [11]

    Akshat Choube, Ha Le, Jiachen Li, Kaixin Ji, Vedant Das Swain, and Varun Mishra. 2025. GLOSS: Group of LLMs for Open-Ended Sensemaking of Passive Sensing Data for Health and Wellbeing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 3 (2025), 1–32

  12. [12]

    Bangde Du, Minghao Guo, Songming He, Ziyi Ye, Xi Zhu, Weihang Su, Shuqi Zhu, Yujia Zhou, Yongfeng Zhang, Qingyao Ai, et al. 2025. TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation. arXiv preprint arXiv:2510.25536 (2025)

  13. [13]

    Katayoun Farrahi and Daniel Gatica-Perez. 2008. What did you do today? Discovering daily routines from large-scale mobile data. In Proceedings of the 16th ACM international conference on Multimedia. 849–852

  14. [14]

    William Fleeson. 2001. Toward a structure- and process-integrated view of personality: Traits as density distributions of states. Journal of personality and social psychology 80, 6 (2001), 1011

  15. [15]

    Ge Gao, Alexey Taymanov, Eduardo Salinas, Paul Mineiro, and Dipendra Misra. 2024. Aligning llm agents by learning latent preference from user edits. Advances in Neural Information Processing Systems 37 (2024), 136873–136896

  16. [16]

    Tao Ge, Xin Chan, Xiaoyang Wang, Dian Yu, Haitao Mi, and Dong Yu. 2024. Scaling synthetic data creation with 1,000,000,000 personas. arXiv preprint arXiv:2406.20094 (2024)

  17. [17]

    Peter Haehner, Amanda Jo Wright, and Wiebke Bleidorn. 2024. A systematic review of volitional personality change research. Communications Psychology 2, 1 (2024), 115

  18. [18]

    Jiaming Han, Kaixiong Gong, Yiyuan Zhang, Jiaqi Wang, Kaipeng Zhang, Dahua Lin, Yu Qiao, Peng Gao, and Xiangyu Yue. 2024. Onellm: One framework to align all modalities with language. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 26584–26595

  19. [19]

    Lixing He, Bufang Yang, Di Duan, Zhenyu Yan, and Guoliang Xing. 2025. Embodiedsense: Understanding embodied activities with earphones. arXiv e-prints (2025), arXiv–2504

  21. [21]

    Bowen Jiang, Zhuoqun Hao, Young-Min Cho, Bryan Li, Yuan Yuan, Sihao Chen, Lyle Ungar, Camillo J Taylor, and Dan Roth. 2025. Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale. arXiv preprint arXiv:2504.14225 (2025)

  22. [22]

    Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. 2023. LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 13358–13376

  24. [24]

    Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, and Lili Qiu. 2024. Longllmlingua: Accelerating and enhancing llms in long context scenarios via prompt compression. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1658–1677

  25. [25]

    Jiachen Li, Xiwen Li, Justin Steinberg, Akshat Choube, Bingsheng Yao, Xuhai Xu, Dakuo Wang, Elizabeth Mynatt, and Varun Mishra. 2025. Vital Insight: Assisting Experts’ Context-Driven Sensemaking of Multimodal Personal Tracking Data Using Visualization and Human-in-the-Loop LLM. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Techn...

  26. [26]

    Jiaqi Liu, Yaofeng Su, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, and Huaxiu Yao. 2026. SimpleMem: Efficient Lifelong Memory for LLM Agents. arXiv preprint arXiv:2601.02553 (2026)

  27. [27]

    Kaiwei Liu, Bufang Yang, Lilin Xu, Yunqi Guo, Neiwen Ling, Zhihe Zhao, Guoliang Xing, Xian Shuai, Xiaozhe Ren, Xin Jiang, et al. 2024. Tasking heterogeneous sensor systems with LLMs. In Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems. 901–902

  28. [28]

    Kaiwei Liu, Bufang Yang, Lilin Xu, Yunqi Guo, Guoliang Xing, Xian Shuai, Xiaozhe Ren, Xin Jiang, and Zhenyu Yan. 2025. TaskSense: A Translation-like Approach for Tasking Heterogeneous Sensor Systems with LLMs. In Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems. 213–225

  29. [29]

    Lin Long, Yichen He, Wentao Ye, Yiyuan Pan, Yuan Lin, Hang Li, Junbo Zhao, and Wei Li. 2025. Seeing, listening, remembering, and reasoning: A multimodal agent with long-term memory. arXiv preprint arXiv:2508.09736 (2025)

  30. [30]

    Hong Lu, Denise Frauendorfer, Mashfiqui Rabbi, Marianne Schmid Mast, Gokul T Chittaranjan, Andrew T Campbell, Daniel Gatica-Perez, and Tanzeem Choudhury. 2012. Stresssense: Detecting stress in unconstrained acoustic environments using smartphones. In Proceedings of the 2012 ACM conference on ubiquitous computing. 351–360

  31. [31]

    OpenAI. 2023. ChatGPT. https://chat.openai.com

  32. [32]

    Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. Advances in neural information processing systems 35 (2022), 27730–27744

  33. [33]

    Xiaomin Ouyang and Mani Srivastava. 2024. LLMSense: Harnessing LLMs for high-level reasoning over spatiotemporal sensor traces. arXiv preprint arXiv:2403.19857 (2024)

  34. [34]

    Daniel J Ozer and Veronica Benet-Martinez. 2006. Personality and the prediction of consequential outcomes. Annu. Rev. Psychol. 57, 1 (2006), 401–421

  35. [35]

    Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, et al. 2024. LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression. In ACL (Findings). 13

  36. [36]

    Kevin Post, Reo Kuchida, Mayowa Olapade, Zhigang Yin, Petteri Nurmi, and Huber Flores. 2025. Contextllm: Meaningful context reasoning from multi-sensor and multi-device data using llms. In Proceedings of the 26th International Workshop on Mobile Computing Systems and Applications. 13–18

  37. [37]

    Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. 2023. Direct preference optimization: Your language model is secretly a reward model. Advances in neural information processing systems 36 (2023), 53728–53741

  38. [38]

    Jerome Ramos, Hossein A Rahmani, Xi Wang, Xiao Fu, and Aldo Lipani. 2024. Transparent and scrutable recommendations using natural language user profiles. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 13971–13984

  39. [39]

    Nils Reimers and Iryna Gurevych. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084 (2019)

  40. [40]

    Brent W Roberts and Daniel Mroczek. 2008. Personality trait change in adulthood. Current directions in psychological science 17, 1 (2008), 31–35

  41. [41]

    Brent W Roberts, Kate E Walton, and Wolfgang Viechtbauer. 2006. Patterns of mean-level change in personality traits across the life course: a meta-analysis of longitudinal studies. Psychological bulletin 132, 1 (2006), 1

  42. [42]

    Vijay Srinivasan, Saeed Moghaddam, Abhishek Mukherji, Kiran K Rachuri, Chenren Xu, and Emmanuel Munguia Tapia. 2014. Mobileminer: Mining your frequent patterns on your phone. In Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing. 389–400

  43. [43]

    Silero Team. 2024. Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier. https://github.com/snakers4/silero-vad

  44. [44]

    Ye Tian, Xiaoyuan Ren, Zihao Wang, Onat Gungor, Xiaofan Yu, and Tajana Rosing. 2025. DailyLLM: context-aware activity log generation using multi-modal sensors and LLMs. In 2025 IEEE 22nd International Conference on Mobile Ad-Hoc and Smart Systems (MASS). IEEE, 372–380

  45. [45]

    Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, and Jitao Sang. 2024. Mobile-agent-v2: Mobile device operation assistant with effective navigation via multi-agent collaboration. Advances in Neural Information Processing Systems 37 (2024), 2686–2710

  46. [46]

    Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari, Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T Campbell. 2014. StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing. 3–14

  48. [48]

    Rui Wang, Gabriella Harari, Peilin Hao, Xia Zhou, and Andrew T Campbell. 2015. SmartGPA: how smartphones can assess and predict academic performance of college students. In Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing. 295–306

  49. [49]

    Tiannan Wang, Meiling Tao, Ruoyu Fang, Huilin Wang, Shuai Wang, Yuchen Eleanor Jiang, and Wangchunshu Zhou. 2024. Ai persona: Towards life-long personalization of llms. arXiv preprint arXiv:2412.13103 (2024)

  50. [50]

    Weichen Wang, Gabriella M Harari, Rui Wang, Sandrine R Müller, Shayan Mirjafari, Kizito Masaba, and Andrew T Campbell. 2018. Sensing behavioral change over time: Using within-person variability features from mobile sensing to predict personality traits. Proceedings of the ACM on interactive, mobile, wearable and ubiquitous technologies 2, 3 (2018), 1–21

  51. [51]

    Zhepei Wei, Wenlin Yao, Yao Liu, Weizhi Zhang, Qin Lu, Liang Qiu, Changlong Yu, Puyang Xu, Chao Zhang, Bing Yin, et al. 2025. Webagent-r1: Training web agents via end-to-end multi-turn reinforcement learning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 7920–7939

  52. [52]

    John T Wixted. 2004. The psychology and neuroscience of forgetting. Annu. Rev. Psychol. 55, 1 (2004), 235–269

  53. [53]

    Wendy Wood and David T Neal. 2007. A new look at habits and the habit-goal interface. Psychological review 114, 4 (2007), 843

  54. [54]

    Huatao Xu, Liying Han, Qirui Yang, Mo Li, and Mani Srivastava. 2024. Penetrative ai: Making llms comprehend the physical world. In Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications. 1–7

  55. [55]

    Huatao Xu, Zilin Zeng, Panrong Tong, Mo Li, and Mani B Srivastava. 2025. Autolife: Automatic life journaling with smartphones and llms. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 4 (2025), 1–29

  57. [57]

    Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, and Yongfeng Zhang. 2025. A-mem: Agentic memory for llm agents. arXiv preprint arXiv:2502.12110 (2025)

  58. [58]

    Xuhai Xu, Xin Liu, Han Zhang, Weichen Wang, Subigya Nepal, Yasaman Sefidgar, Woosuk Seo, Kevin S Kuehn, Jeremy F Huckins, Margaret E Morris, et al. 2023. Globem: Cross-dataset generalization of longitudinal human behavior modeling. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 4 (2023), 1–34

  59. [59]

    Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, et al. 2022. GLOBEM dataset: multi-year datasets for longitudinal human behavior modeling generalization. Advances in neural information processing systems 35 (2022), 24655–24692

  61. [61]

    An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong Tang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jing Zhou, Jingren Zhou, Junyang Lin, Kai Dang, Keqin Bao, Kexin Yang, et al. 2025. Qwen3 Technical Report. arXiv preprint arXiv:2505.09388 (2025)

  63. [63]

    An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tingyu X...

  64. [64]

    Bufang Yang, Yunqi Guo, Lilin Xu, Zhenyu Yan, Hongkai Chen, Guoliang Xing, and Xiaofan Jiang. 2025. Socialmind: Llm-based proactive ar social assistive system with human-like perception for in-situ live interactions. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 1 (2025), 1–30

  65. [65]

    Bufang Yang, Lixing He, Neiwen Ling, Zhenyu Yan, Guoliang Xing, Xian Shuai, Xiaozhe Ren, and Xin Jiang. 2023. Edgefm: Leveraging foundation model for open-set learning on the edge. In Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems. 111–124

  66. [66]

    Bufang Yang, Lixing He, Kaiwei Liu, and Zhenyu Yan. 2024. Viassist: Adapting multi-modal large language models for users with visual impairments. In 2024 IEEE International Workshop on Foundation Models for Cyber-Physical Systems & Internet of Things (FMSys). IEEE, 32–37

  67. [67]

    Bufang Yang, Siyang Jiang, Lilin Xu, Kaiwei Liu, Hai Li, Guoliang Xing, Hongkai Chen, Xiaofan Jiang, and Zhenyu Yan. 2024. Drhouse: An llm-empowered diagnostic reasoning system through harnessing outcomes from sensor data and expert knowledge. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, 4 (2024), 1–29

  68. [68]

    Bufang Yang, Le Liu, Wenxuan Wu, Mengliang Zhou, Hongxing Liu, and Xinbao Ning. 2023. BrainZ-BP: A noninvasive cuff-less blood pressure estimation approach leveraging brain bio-impedance and electrocardiogram. IEEE Transactions on Instrumentation and Measurement 73 (2023), 1–13

  69. [69]

    Bufang Yang, Wenrui Lu, Lixing He, Neiwen Ling, Zhenyu Yan, Guoliang Xing, Xian Shuai, Xiaozhe Ren, and Xin Jiang. 2026. An Efficient Edge-Cloud Collaboration System with Foundational Models for Open-Set IoT Applications. IEEE Transactions on Mobile Computing (2026)

  70. [70]

    Bufang Yang, Wenxuan Wu, Yitian Liu, and Hongxing Liu. 2022. A novel sleep stage contextual refinement algorithm leveraging conditional random fields. IEEE Transactions on Instrumentation and Measurement 71 (2022), 1–13

  71. [71]

    Bufang Yang, Lilin Xu, Liekang Zeng, Yunqi Guo, Siyang Jiang, Wenrui Lu, Kaiwei Liu, Hancheng Xiang, Xiaofan Jiang, Guoliang Xing, et al. 2025. ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems. arXiv preprint arXiv:2512.06721 (2025)

  72. [72]

    Bufang Yang, Lilin Xu, Liekang Zeng, Kaiwei Liu, Siyang Jiang, Wenrui Lu, Hongkai Chen, Xiaofan Jiang, Guoliang Xing, and Zhenyu Yan. 2025. ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions. arXiv preprint arXiv:2505.14668 (2025)

  74. [74]

    Xiaofan Yu, Lanxiang Hu, Benjamin Reichman, Dylan Chu, Rushil Chandrupatla, Xiyuan Zhang, Larry Heck, and Tajana S Rosing. 2025. Sensorchat: Answering qualitative and quantitative questions during long-term multimodal sensor interactions. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 9, 3 (2025), 1–35

  75. [75]

    Siyan Zhao, Mingyi Hong, Yang Liu, Devamanyu Hazarika, and Kaixiang Lin. 2025. Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs. arXiv:2502.09597 [cs.LG] https://arxiv.org/abs/2502.09597

  76. [76]

    Yu Zheng, Quannan Li, Yukun Chen, Xing Xie, and Wei-Ying Ma. 2008. Understanding mobility based on GPS data. In Proceedings of the 10th International Conference on Ubiquitous Computing. 312–321

  77. [77]

    Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, and Yanlin Wang. 2024. Memorybank: Enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 19724–19731

APPENDIX A DETAILS OF USER STUDY

[Figure: user ratings on a 1–5 scale comparing SensorPersona and ContextLLM across Accuracy, Stability, Coverage, Specificity, and Clarity; panel (a) shows all users]