ScreenLLM: Stateful screen schema for efficient action understanding and prediction. arXiv preprint arXiv:2503.20978

Yiqiao Jin et al. · arXiv 2503.20978

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Agent-Computer Observation Interfaces Enable Dynamic Computer Use

cs.AI · 2026-06-28 · conditional · novelty 7.0

AOI adds keyframe capture, volume-gated audio transcription, and visual narration to computer-use agents, producing +17 to +48 pp gains over screenshot baselines on DynaCU-Bench with no retraining.

citing papers explorer

Agent-Computer Observation Interfaces Enable Dynamic Computer Use · cs.AI · 2026-06-28 · conditional · none · ref 7
