Recognition: no theorem link
ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild
Pith reviewed 2026-05-17 00:34 UTC · model grok-4.3
The pith
ProAgent lets a proactive LLM agent assist users throughout daily life by activating detailed sensing only when low-cost contextual cues indicate a likely need.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ProAgent is an end-to-end system that combines on-demand tiered perception, proactive-oriented context extraction, and a context-aware proactive reasoner to deliver continuous in-the-wild assistance while keeping system overhead low.
What carries the argument
On-demand tiered perception, which pairs continuous low-cost contextual cues with selective activation of richer sensory processing.
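The activation logic is not spelled out in this summary, so the following is a minimal sketch of how low-cost cues might gate richer perception. The cue sources (IMU and audio activity), the single CUE_THRESHOLD parameter, and the run_rich_perception helper are illustrative assumptions, not ProAgent's actual API.

```python
# Minimal sketch of on-demand tiered perception; illustrative only, not
# ProAgent's implementation. Low-cost cues run continuously; expensive
# perception (e.g., a vision-language model on AR-glasses frames) runs
# only when the cue score crosses a threshold.
from dataclasses import dataclass
from typing import Optional

CUE_THRESHOLD = 0.6  # assumed free parameter: a "tiered perception threshold"

@dataclass
class Frame:
    imu_activity: float    # low-cost motion cue, normalized to [0, 1]
    audio_activity: float  # low-cost audio cue, normalized to [0, 1]
    image: bytes           # raw frame, processed only on demand

def cue_score(frame: Frame) -> float:
    """Cheap, always-on score derived from low-cost contextual cues."""
    return max(frame.imu_activity, frame.audio_activity)

def run_rich_perception(frame: Frame) -> dict:
    """Placeholder for expensive sensing (e.g., a VLM scene description)."""
    return {"scene": "unknown", "objects": []}

def perceive(frame: Frame) -> Optional[dict]:
    """Activate the richer tier only when low-cost cues pass the threshold."""
    if cue_score(frame) >= CUE_THRESHOLD:
        return run_rich_perception(frame)
    return None  # stay in the low-cost tier; no extra overhead incurred
```

The threshold in this sketch is the kind of free parameter listed in the ledger below: set it too low and the rich tier runs constantly, too high and genuine needs are missed, which is exactly the trade-off the referee asks to see quantified.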
If this is right
- Higher accuracy in predicting when a user needs help compared with always-on or fixed-context baselines.
- Fewer unnecessary activations that would drain resources or annoy the user.
- Support for continuous daily assistance rather than short task-specific episodes.
- Practical deployment on wearable hardware such as AR glasses without prohibitive overhead.
Where Pith is reading between the lines
- The same tiered approach could extend to other battery-limited devices like phones or earbuds where constant sensing is costly.
- User preference modeling inside the context extractor may need ongoing updates as habits change over weeks or months.
- If the low-cost cues prove too coarse in noisy environments, the system might require additional lightweight sensors to maintain reliability.
Load-bearing premise
Low-cost cues will reliably signal when richer perception is needed without missing important user needs or adding unacceptable delay.
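A worked illustration of why this premise carries the claim, under the simplifying assumption that the reasoner only ever sees episodes the low-cost gate lets through:

```latex
% Let m be the fraction of genuine user needs the low-cost gate misses,
% and r the recall of the downstream reasoner on the episodes it does see.
% Under the assumption above,
\[
  \text{recall}_{\text{end-to-end}} \;=\; (1 - m)\, r \;\le\; 1 - m ,
\]
% so no amount of reasoner quality can recover needs the cheap cues never surface.
```

Latency is a second failure mode this bound does not capture: a need can be detected but surfaced too late to be useful.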
What would settle it
Real-world trials that test whether the agent fails to activate detailed sensing for genuine user needs or incurs noticeable latency before responding.
Original abstract
Recent studies have begun to explore proactive large language model (LLM) agents that provide unobtrusive assistance by automatically leveraging contextual information, such as in code editing and in-app suggestions. However, most focus on short, task-specific episodes or on-screen contexts, rather than continuously perceiving and assisting users throughout daily life. Enabling such in-the-wild assistance requires continuous sensing of users' surroundings, which can incur substantial system overhead. In this work, we propose ProAgent, an end-to-end proactive agent system that harnesses on-demand sensory contexts to provide in-the-wild assistance. ProAgent first employs on-demand tiered perception to continuously sense users' surroundings by integrating low-cost contextual cues with richer perception on demand, and uses proactive-oriented context extraction to derive hierarchical contexts integrating both sensory contexts and human preferences. ProAgent then employs a context-aware proactive reasoner to infer user needs and invokes external tools to deliver proactive assistance. We implement ProAgent on AR glasses and evaluate it on a public dataset and a real-world dataset. Results demonstrate that ProAgent achieves up to 27.7% higher proactive prediction accuracy and 20.5% lower false detection than state-of-the-art baselines. A user study with 20 participants shows that 85% were satisfied with ProAgent and willing to use it in daily life.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ProAgent, an end-to-end proactive LLM agent system for in-the-wild assistance that integrates on-demand tiered perception (low-cost contextual cues triggering richer sensing), proactive-oriented hierarchical context extraction incorporating sensory data and user preferences, and a context-aware reasoner that infers needs and invokes external tools. The system is implemented on AR glasses and evaluated on a public dataset and a real-world dataset, reporting up to 27.7% higher proactive prediction accuracy and 20.5% lower false detection versus state-of-the-art baselines, along with 85% user satisfaction in a 20-participant study.
Significance. If the empirical gains hold under broader conditions, the work would meaningfully advance proactive LLM agents by showing how tiered, on-demand sensing can reduce continuous overhead while maintaining relevance for daily-life assistance. The combination of low-cost cues with hierarchical context and tool invocation provides a concrete architecture that could be extended to other wearable or ambient platforms.
major comments (2)
- [Evaluation section (results on public and real-world datasets)] The central claim of 27.7% higher proactive prediction accuracy and 20.5% lower false detection depends on the tiered perception pipeline correctly activating richer sensing only when low-cost cues indicate relevant needs. No ablation study or error analysis is provided that quantifies miss rates of the low-cost cues under atypical environments, preference shifts, or novel situations (see Evaluation section and results tables). Without this, it is unclear whether the reported gains generalize to continuous in-the-wild operation.
- [User study subsection] The user study with 20 participants reports 85% satisfaction and willingness to use the system, yet provides insufficient detail on experimental protocol, how latency or missed detections were assessed, and comparison to baselines. This weakens support for the claim of practical daily-life applicability.
minor comments (2)
- [System architecture / tiered perception description] Clarify the exact low-cost cues, perception thresholds, and activation logic in the tiered perception module; the current description leaves the decision criteria somewhat underspecified.
- [Abstract and Evaluation] In the abstract and results, state the precise datasets and conditions under which the maximum 27.7% accuracy gain is observed rather than reporting only the peak value.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address the two major points below and will revise the manuscript to strengthen the evaluation and user study sections.
Point-by-point responses
- Referee: [Evaluation section (results on public and real-world datasets)] The central claim of 27.7% higher proactive prediction accuracy and 20.5% lower false detection depends on the tiered perception pipeline correctly activating richer sensing only when low-cost cues indicate relevant needs. No ablation study or error analysis is provided that quantifies miss rates of the low-cost cues under atypical environments, preference shifts, or novel situations (see Evaluation section and results tables). Without this, it is unclear whether the reported gains generalize to continuous in-the-wild operation.
Authors: We agree that quantifying the miss rates of the low-cost cues is important for demonstrating generalization. In the revised manuscript we will add an ablation study and error analysis subsection to the Evaluation section. This will report miss rates and failure cases of the tiered perception module on both the public and real-world datasets under atypical environments, preference shifts, and novel situations, together with an analysis of how these errors affect overall proactive prediction accuracy and false detections; a minimal sketch of such metrics appears after these responses. Revision: yes.
- Referee: [User study subsection] The user study with 20 participants reports 85% satisfaction and willingness to use the system, yet provides insufficient detail on experimental protocol, how latency or missed detections were assessed, and comparison to baselines. This weakens support for the claim of practical daily-life applicability.
Authors: We acknowledge that additional protocol details are needed. In the revised User Study subsection we will expand the description to include the full experimental protocol (participant recruitment, demographics, task scenarios, and session structure), the specific methods used to measure and log latency and missed detections, and direct comparisons of user satisfaction and perceived usefulness against the same baselines used in the quantitative evaluation. Revision: yes.
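To make the promised error analysis concrete, here is a minimal sketch of the metrics such an ablation could report: end-to-end prediction accuracy, false detection rate, and the miss rate of the low-cost cue gate. The per-episode labels (need, cue_fired, predicted_need) and these exact definitions are assumptions for illustration, not the paper's formulas.

```python
# Illustrative metric definitions for the promised ablation/error analysis.
# The paper's exact definitions of "proactive prediction accuracy" and
# "false detection" are not given here, so these are assumptions, not
# ProAgent's formulas.

def tiered_metrics(episodes: list[dict]) -> dict:
    """Each episode is a dict with:
       'need'           -- bool, ground-truth user need in this episode
       'cue_fired'      -- bool, low-cost cues activated richer sensing
       'predicted_need' -- bool, the reasoner's final proactive decision
    """
    if not episodes:
        raise ValueError("episodes must be non-empty")
    needs = [e for e in episodes if e["need"]]
    no_needs = [e for e in episodes if not e["need"]]

    # Miss rate of the low-cost gate: genuine needs where rich sensing never ran.
    cue_miss_rate = (
        sum(1 for e in needs if not e["cue_fired"]) / len(needs) if needs else 0.0
    )
    # End-to-end accuracy of proactive predictions over all episodes.
    accuracy = sum(
        1 for e in episodes if e["predicted_need"] == e["need"]
    ) / len(episodes)
    # False detections: assistance offered when no genuine need existed.
    false_detection_rate = (
        sum(1 for e in no_needs if e["predicted_need"]) / len(no_needs)
        if no_needs else 0.0
    )
    return {
        "cue_miss_rate": cue_miss_rate,
        "accuracy": accuracy,
        "false_detection_rate": false_detection_rate,
    }
```

For example, tiered_metrics([{'need': True, 'cue_fired': False, 'predicted_need': False}]) reports a cue miss rate of 1.0, the failure mode the load-bearing premise above rules out.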
Circularity Check
No circularity: empirical system evaluation on external datasets
full rationale
The paper describes an implemented proactive agent system (ProAgent) using tiered perception and context extraction, then reports measured accuracy gains (up to 27.7% higher proactive prediction accuracy) from evaluation on a public dataset and a real-world dataset. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. All central claims rest on external experimental results rather than reducing to the paper's own inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- tiered perception thresholds
axioms (1)
- domain assumption: contextual cues from low-cost sensors can reliably indicate the need for detailed perception
Forward citations
Cited by 7 Pith papers
- Pro²Assist: Continuous Step-Aware Proactive Assistance with Multimodal Egocentric Perception for Long-Horizon Procedural Tasks
Pro²Assist uses multimodal egocentric perception from AR glasses to track fine-grained progress in long-horizon procedural tasks and deliver timely proactive assistance, outperforming baselines by over 21% in action u...
- From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
ProVoice-Bench is the first framework to evaluate proactive voice agents, revealing that state-of-the-art multimodal LLMs struggle with over-triggering and context-aware reasoning.
- SensorPersona: An LLM-Empowered System for Continual Persona Extraction from Longitudinal Mobile Sensor Streams
SensorPersona uses LLMs for hierarchical reasoning on longitudinal mobile sensor streams to continually extract stable personas, showing up to 31.4% higher recall and 85.7% win rate over baselines on a 20-user dataset.
- Agentic Coding Needs Proactivity, Not Just Autonomy
Coding agents require a three-level proactivity taxonomy (Reactive, Scheduled, Situation Aware) evaluated by insight policy quality using Insight Decision Quality, Context Grounding Score, and Learning Lift.
- VisionClaw: Always-On AI Agents through Smart Glasses
VisionClaw couples continuous egocentric vision on smart glasses with speech-driven AI agents to enable hands-free real-world tasks, with lab and field studies showing faster completion and a shift toward opportunisti...
- Position: Life-Logging Video Streams Make the Privacy-Utility Trade-off Inevitable
Life-logging video streams create an inevitable privacy-utility trade-off that is a foundational challenge for always-on AI systems.
- PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory
PASK introduces the DD-MM-PAS paradigm for streaming proactive agents with intent-aware detection, hybrid memory modeling, and a new real-world benchmark where the IntentFlow model matches top LLMs on latency while fi...