Do Proactive Agents Really Need an LLM to Decide When to Wake and What to Anchor?

Amir H. Abdi; Jing Gao; Michel Galley; Ruowang Zhang; Siheng Xiong; Xiaoqian Wang; Xiaoze Liu; Zhikai Chen

arxiv: 2605.30152 · v1 · pith:BNJBJDUXnew · submitted 2026-05-28 · 💻 cs.CL · cs.AI· cs.HC

Do Proactive Agents Really Need an LLM to Decide When to Wake and What to Anchor?

Xiaoze Liu , Ruowang Zhang , Amir H. Abdi , Michel Galley , Zhikai Chen , Siheng Xiong , Xiaoqian Wang , Jing Gao This is my paper

Pith reviewed 2026-06-29 07:42 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.HC

keywords proactive agentstemporal graph learningevent streamstrigger detectionLLM efficiencyon-device deploymentgraph neural networks

0 comments

The pith

A small temporal graph model on OS event streams outperforms LLM-based triggers for proactive agents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that proactive agents do not need to query an LLM on every user activity event to decide when to wake up. Instead, the always-on signal is a structured graph of events that the OS maintains, which a small TGL model can process directly to output a trigger probability. This allows the LLM to be called only for generating the actual response after the trigger fires, leading to better accuracy and much lower latency. The TGL approach improves F1 scores across 14 different backbones and runs significantly faster with a tiny memory footprint suitable for on-device use.

Core claim

By treating user activity as graph updates of structured (actor, verb, object, timestamp) tuples and using a TGL encoder, the system produces a per-event trigger probability and per-entity routing score in one forward pass, resulting in higher F1 scores on 14 backbones with a mean improvement of 16.7 and up to 46.0, stronger AUCs, and inference speeds of 11.13 ms on GPU servers and 13.99 ms on laptops, 4-83 times faster than LLM triggers.

What carries the argument

Temporal graph learning (TGL) model acting as encoder on structured event streams to compute trigger probabilities and routing scores.

If this is right

TGL improves F1 on each of 14 backbones with mean gain of 16.7 points.
One TGL checkpoint achieves the strongest trigger AUCs and most stable deployed threshold.
Inference runs at 11.13 ms per event on GPU and 13.99 ms on laptop.
Approximately 4-7x faster on GPU and 12-83x faster on laptop than LLM-as-trigger setups.
Deployable with 220 MiB BF16 footprint on-device.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation allows the activity stream to remain local without repeated external LLM calls, improving privacy for sensitive user data.
This architecture could be extended to other real-time decision systems that monitor structured logs rather than unstructured text.
Reducing the frequency of LLM invocations may lower overall energy consumption in always-on agent deployments.

Load-bearing premise

The structured event stream of actor-verb-object-timestamp tuples already holds sufficient information to make accurate trigger decisions without additional context from an LLM.

What would settle it

A head-to-head test on the same event dataset showing that an LLM queried on text-rendered events produces higher F1 scores or AUC than the TGL model while matching or exceeding its speed.

Figures

Figures reproduced from arXiv: 2605.30152 by Amir H. Abdi, Jing Gao, Michel Galley, Ruowang Zhang, Siheng Xiong, Xiaoqian Wang, Xiaoze Liu, Zhikai Chen.

**Figure 3.** Figure 3: Ablation study. Each panel reports one backbone’s F1 for [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Per-event trigger-probability histograms on the ProactiveAgent test set, one panel per row of Table [PITH_FULL_IMAGE:figures/full_fig_p023_4.png] view at source ↗

**Figure 5.** Figure 5: Drop-in TGL trigger on the two fine-tuned [PITH_FULL_IMAGE:figures/full_fig_p023_5.png] view at source ↗

read the original abstract

Proactive agents read user activity as text and call an LLM on every event to decide whether to act. But user activity is not natively text: it is a structured event stream of (actor, verb, object, timestamp) tuples that the operating system already maintains in graph form. Rendering the structure as text and asking an LLM to recover it is a round-trip the system never had to take. We treat the always-on signal as graph updates rather than text and use a small temporal-graph-learning (TGL) model as the encoder: one forward pass yields a per-event trigger probability and a per-entity routing score, and only the downstream agent (turning a small structured handoff into a fluent user-facing sentence) is an LLM call, invoked only when the trigger fires. TGL improves F1 on each of 14 backbones (mean +16.7, up to +46.0); in trigger-architecture comparisons, one TGL checkpoint gives the strongest trigger AUCs and the most stable deployed threshold. It runs at 11.13 ms per event on a GPU server and 13.99 ms on a consumer laptop, approximately 4--7x and 12--83x faster than every single-forward LLM-as-trigger configuration tested in each regime, with an approximately 220 MiB BF16 resident footprint deployable on-device alongside the privacy-sensitive activity stream it consumes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

TGL on structured event graphs cuts LLM calls for agent triggers and runs faster, but the gains need full experimental details to confirm.

read the letter

The core claim is that a small temporal graph model can handle the always-on trigger decision from raw (actor, verb, object, timestamp) streams without first turning them into text for an LLM. One forward pass gives both a trigger probability and a routing score, so the LLM only runs when something actually needs to happen. That design is the main novelty here.

The paper shows consistent F1 lifts across 14 different backbones, stronger AUC on trigger detection, and latency drops to roughly 11-14 ms per event with a 220 MiB footprint. Those speed numbers matter for on-device or low-latency agent setups, and the head-to-head comparisons against LLM-as-trigger baselines look favorable on the reported metrics.

The weakest part visible from the abstract is the lack of any protocol description, baseline definitions, or dataset characteristics. Without those, it's hard to judge how much the gains depend on particular event distributions or how the graph is built in practice. The central assumption—that the structured stream already carries the needed signal—is tested by the empirical results rather than just asserted, which is the right way to handle it.

This is aimed at people working on proactive agents or always-on interfaces who already have access to OS-level event graphs. A reader who cares about reducing LLM invocations for cost or privacy reasons will find the latency and size numbers directly useful. The work shows clear thinking on the architecture trade-off even if the underlying graph model is not itself a new contribution.

I would send it to peer review. The practical framing is sharp enough to merit referee time once the methods section is checked.

Referee Report

2 major / 2 minor

Summary. The paper claims that proactive agents can avoid per-event LLM calls for trigger decisions by instead processing the OS-maintained structured event stream (actor, verb, object, timestamp) as graph updates with a small temporal-graph-learning (TGL) encoder. One TGL forward pass produces a trigger probability and entity routing score; only when the trigger fires is an LLM invoked for the downstream response. Across 14 backbones the TGL approach improves F1 by a mean of +16.7 (up to +46.0), yields the strongest trigger AUCs and most stable deployed thresholds in architecture comparisons, and runs at 11.13 ms per event on a GPU server and 13.99 ms on a consumer laptop (4–7× and 12–83× faster than single-forward LLM triggers) with a 220 MiB BF16 footprint.

Significance. If the reported head-to-head results hold under a reproducible protocol, the work supplies direct empirical evidence that the structured graph already available to the OS contains sufficient signal for accurate trigger decisions, eliminating the text-rendering round-trip and enabling low-latency, on-device deployment. The latency and footprint numbers, together with the multi-backbone F1 gains, would constitute a concrete, falsifiable demonstration that LLM-based always-on triggering is unnecessary for this sub-task.

major comments (2)

[Abstract / Experimental section] The abstract and available text present quantitative claims (F1 deltas, AUC rankings, latency figures, threshold stability) across 14 backbones and multiple runtime regimes, yet supply no description of the experimental protocol, dataset characteristics, how baselines were implemented or prompted, train/validation/test splits, or statistical tests. Without these details the reported gains cannot be verified or reproduced.
[Trigger-architecture comparisons] The central comparison treats the TGL checkpoint as a single model evaluated against multiple LLM trigger configurations, but the manuscript does not state whether the TGL model was trained on the same event streams used for LLM evaluation or whether any hyper-parameter search or early-stopping criteria were applied uniformly; this information is required to assess whether the +16.7 mean F1 improvement is attributable to the graph representation itself.

minor comments (2)

[Abstract] The latency numbers are given to two decimal places but without error bars or number of runs; adding this information would strengthen the speed claims.
[Trigger-architecture comparisons] The phrase “one TGL checkpoint” is used without clarifying whether multiple random seeds or training runs were performed and how the reported checkpoint was selected.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on experimental transparency. We agree that additional protocol details are required for reproducibility and will expand the manuscript accordingly. Point-by-point responses follow.

read point-by-point responses

Referee: [Abstract / Experimental section] The abstract and available text present quantitative claims (F1 deltas, AUC rankings, latency figures, threshold stability) across 14 backbones and multiple runtime regimes, yet supply no description of the experimental protocol, dataset characteristics, how baselines were implemented or prompted, train/validation/test splits, or statistical tests. Without these details the reported gains cannot be verified or reproduced.

Authors: We agree the current draft omits these details. The revised manuscript will include a dedicated Experiments section specifying: dataset as OS event streams of (actor, verb, object, timestamp) tuples collected from 14 backbones; temporal splits (70/15/15) to preserve causality; LLM baselines implemented with a standardized zero-shot prompt for binary trigger decisions; uniform hyper-parameter search via grid search on validation AUC; and statistical tests via bootstrap (1000 resamples) for F1 differences. These additions will support verification of the reported metrics. revision: yes
Referee: [Trigger-architecture comparisons] The central comparison treats the TGL checkpoint as a single model evaluated against multiple LLM trigger configurations, but the manuscript does not state whether the TGL model was trained on the same event streams used for LLM evaluation or whether any hyper-parameter search or early-stopping criteria were applied uniformly; this information is required to assess whether the +16.7 mean F1 improvement is attributable to the graph representation itself.

Authors: The TGL model was trained and evaluated on identical event streams using the same temporal splits. Hyper-parameter search (learning rate, dimensions, layers) and early-stopping (validation AUC patience) were applied uniformly to TGL and LLM configurations. The F1 gains are therefore attributable to native graph encoding. The revision will add an explicit paragraph stating this protocol. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper contains no equations, derivations, or load-bearing self-citations. All reported results consist of direct empirical comparisons (F1 improvements, AUC values, latency measurements) of a TGL encoder against LLM trigger baselines on a fixed structured event stream. The central claim that the graph representation suffices is tested by those head-to-head numbers rather than derived from any fitted parameter or prior self-referential result, leaving the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.1-grok · 5812 in / 1210 out tokens · 24998 ms · 2026-06-29T07:42:47.806527+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Audio Interaction Model
cs.SD 2026-06 unverdicted novelty 6.0

Audio-Interaction unifies offline and online audio tasks into one streaming model via the SoundFlow framework and a new 2.6M-item streaming corpus, enabling real-time instruction following and proactive responses.

Reference graph

Works this paper leans on

16 extracted references · 5 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1]

In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , pages 175–

A diary study of task switching and interrup- tions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , pages 175–
[2]

Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hi- bschman, Daniel Afergan, Y ang Li, Jeffrey Nichols, and Ranjitha Kumar

ACM. Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hi- bschman, Daniel Afergan, Y ang Li, Jeffrey Nichols, and Ranjitha Kumar. 2017. Rico: A mobile app dataset for building data-driven design applications. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology , pages 845–854. Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Ch...

2017
[3]

Advances in Neural Information Process- ing Systems, 36

Mind2web: Towards a generalist agent for the web. Advances in Neural Information Process- ing Systems, 36. 9 Y ang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Y ang, and Tat-Seng Chua. 2024. Towards human- centered proactive conversational agents. In Pro- ceedings of the 47th International ACM SIGIR Con- ference on Research and Development in Informa- tion...

2024
[4]

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Matching attentional draw with utility in inter- ruption. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 41– 50. Jongyi Hong, Eui-Ho Suh, Junyoung Kim, and Su- Y eon Kim. 2009. Context-aware system for proac- tive personalized service based on context history. Expert Systems with Applications, 36(4):7448–7457. Guyue H...

work page internal anchor Pith review Pith/arXiv arXiv 2009
[5]

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020 , pages 6669– 6683, Stroudsburg, PA, USA

Recurrent event network: Autoregressive structure inferenceover temporal knowledge graphs . In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020 , pages 6669– 6683, Stroudsburg, PA, USA. Association for Com- putational Linguistics. Brennan Jones, Y an Xu, Qisheng Li, and Stefan Scherer

2020
[6]

In Extended Ab- stracts of the CHI Conference on Human Factors in Computing Systems, pages 1–7

Designing a proactive context-aware ai chat- bot for people’s long-term goals . In Extended Ab- stracts of the CHI Conference on Human Factors in Computing Systems, pages 1–7. Raghav Kapoor, Y ash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, and Ruslan Salakhutdinov. 2024. Omniact: A dataset and benchmark for enabling multimodal...

work page arXiv 2024
[7]

Training proactive and personalized LLM agents.arXiv preprint arXiv:2511.02208, 2025

IEEE. Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, and Yiming Y ang. 2025. Training proactive and person- alized llm agents. arXiv preprint arXiv:2511.02208. Carnegie Mellon University. Zhaoxuan Tan, Qingkai Zeng, Yijun Tian, Zheyuan Liu, Bing Yin, and Meng Jiang. 2024. Democ- ratizing large language models vi...

work page arXiv 2025
[8]

In 6th Inter- national Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings , Vancouver, BC, Canada

Graph attention networks . In 6th Inter- national Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings , Vancouver, BC, Canada. OpenReview.net. Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, and Yufei Ding

2018
[9]

PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory

MGG: Accelerating graph neural net- works with fine-grained intra-kernel communication- computation pipelining on multi-GPU platforms . In 17th USENIX Symposium on Operating Systems De- sign and Implementation (OSDI) . Zixuan Wang, Bo Yu, Junzhe Zhao, Wenhao Sun, Sai Hou, Shuai Liang, Xing Hu, Yinhe Han, and Yiming Gan. 2025. Karma: Augmenting embodied ai...

work page internal anchor Pith review Pith/arXiv arXiv 2025
[10]

In The Eleventh International Con- ference on Learning Representations

ReAct: Synergizing reasoning and acting in language models. In The Eleventh International Con- ference on Learning Representations. Ceyao Zhang, Kaijie Y ang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, and Y aodong Y ang. 2024a. Proagent: building pro...

2024
[11]

The user opened ‘{file}’ in {app}

Appagent-pro: A proactive GUI agent system for multidomain information integration and user as- sistance. arXiv preprint arXiv:2508.18689. Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Y onatan Bisk, Daniel Fried, Uri Alon, and Gra- ham Neubig. 2024. Webarena: A realistic web en- vironment for building...

work page arXiv 2024
[12]

session_history: the last few user activities in this session (oldest first),→
[13]

current_observation: what the user is doing right now,→
[14]

task":"string

tgn_routing_nodes: graph-identified topics most relevant to the current activity,→ Your task: infer the user's current TASK THEME from the session history,,→ then generate ONE short, specific, actionable suggestion grounded in that theme.,→ Rules: - Use session_history to identify what the user is working on (the session theme).,→ - If tgn_routing_nodes c...
[15]

session_history: the last few user activities in this session, oldest first.,→
[16]

open the last file

current_observation: what the user is doing right now.,→ Your task: infer the user's current TASK THEME from the session history,,→ then generate ONE short, specific, actionable suggestion grounded in that theme.,→ Rules: - Use session_history to identify what the user is working on.,→ - The task must be a single sentence, concrete, max 150 characters.,→ ...

2025

[1] [1]

In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , pages 175–

A diary study of task switching and interrup- tions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , pages 175–

[2] [2]

Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hi- bschman, Daniel Afergan, Y ang Li, Jeffrey Nichols, and Ranjitha Kumar

ACM. Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hi- bschman, Daniel Afergan, Y ang Li, Jeffrey Nichols, and Ranjitha Kumar. 2017. Rico: A mobile app dataset for building data-driven design applications. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology , pages 845–854. Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Ch...

2017

[3] [3]

Advances in Neural Information Process- ing Systems, 36

Mind2web: Towards a generalist agent for the web. Advances in Neural Information Process- ing Systems, 36. 9 Y ang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Y ang, and Tat-Seng Chua. 2024. Towards human- centered proactive conversational agents. In Pro- ceedings of the 47th International ACM SIGIR Con- ference on Research and Development in Informa- tion...

2024

[4] [4]

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Matching attentional draw with utility in inter- ruption. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 41– 50. Jongyi Hong, Eui-Ho Suh, Junyoung Kim, and Su- Y eon Kim. 2009. Context-aware system for proac- tive personalized service based on context history. Expert Systems with Applications, 36(4):7448–7457. Guyue H...

work page internal anchor Pith review Pith/arXiv arXiv 2009

[5] [5]

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020 , pages 6669– 6683, Stroudsburg, PA, USA

Recurrent event network: Autoregressive structure inferenceover temporal knowledge graphs . In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020 , pages 6669– 6683, Stroudsburg, PA, USA. Association for Com- putational Linguistics. Brennan Jones, Y an Xu, Qisheng Li, and Stefan Scherer

2020

[6] [6]

In Extended Ab- stracts of the CHI Conference on Human Factors in Computing Systems, pages 1–7

Designing a proactive context-aware ai chat- bot for people’s long-term goals . In Extended Ab- stracts of the CHI Conference on Human Factors in Computing Systems, pages 1–7. Raghav Kapoor, Y ash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, and Ruslan Salakhutdinov. 2024. Omniact: A dataset and benchmark for enabling multimodal...

work page arXiv 2024

[7] [7]

Training proactive and personalized LLM agents.arXiv preprint arXiv:2511.02208, 2025

IEEE. Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, and Yiming Y ang. 2025. Training proactive and person- alized llm agents. arXiv preprint arXiv:2511.02208. Carnegie Mellon University. Zhaoxuan Tan, Qingkai Zeng, Yijun Tian, Zheyuan Liu, Bing Yin, and Meng Jiang. 2024. Democ- ratizing large language models vi...

work page arXiv 2025

[8] [8]

In 6th Inter- national Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings , Vancouver, BC, Canada

Graph attention networks . In 6th Inter- national Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings , Vancouver, BC, Canada. OpenReview.net. Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, and Yufei Ding

2018

[9] [9]

PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory

MGG: Accelerating graph neural net- works with fine-grained intra-kernel communication- computation pipelining on multi-GPU platforms . In 17th USENIX Symposium on Operating Systems De- sign and Implementation (OSDI) . Zixuan Wang, Bo Yu, Junzhe Zhao, Wenhao Sun, Sai Hou, Shuai Liang, Xing Hu, Yinhe Han, and Yiming Gan. 2025. Karma: Augmenting embodied ai...

work page internal anchor Pith review Pith/arXiv arXiv 2025

[10] [10]

In The Eleventh International Con- ference on Learning Representations

ReAct: Synergizing reasoning and acting in language models. In The Eleventh International Con- ference on Learning Representations. Ceyao Zhang, Kaijie Y ang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, and Y aodong Y ang. 2024a. Proagent: building pro...

2024

[11] [11]

The user opened ‘{file}’ in {app}

Appagent-pro: A proactive GUI agent system for multidomain information integration and user as- sistance. arXiv preprint arXiv:2508.18689. Shuyan Zhou, Frank F Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Y onatan Bisk, Daniel Fried, Uri Alon, and Gra- ham Neubig. 2024. Webarena: A realistic web en- vironment for building...

work page arXiv 2024

[12] [12]

session_history: the last few user activities in this session (oldest first),→

[13] [13]

current_observation: what the user is doing right now,→

[14] [14]

task":"string

tgn_routing_nodes: graph-identified topics most relevant to the current activity,→ Your task: infer the user's current TASK THEME from the session history,,→ then generate ONE short, specific, actionable suggestion grounded in that theme.,→ Rules: - Use session_history to identify what the user is working on (the session theme).,→ - If tgn_routing_nodes c...

[15] [15]

session_history: the last few user activities in this session, oldest first.,→

[16] [16]

open the last file

current_observation: what the user is doing right now.,→ Your task: infer the user's current TASK THEME from the session history,,→ then generate ONE short, specific, actionable suggestion grounded in that theme.,→ Rules: - Use session_history to identify what the user is working on.,→ - The task must be a single sentence, concrete, max 150 characters.,→ ...

2025