Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
SAGE is a training-free context reduction method that converts attention signals from a small LLM into a differential relevance heatmap to select top units for downstream QA, achieving competitive accuracy at 10% token budget on benchmarks like QuALITY-hard.
citing papers explorer
- FOCAL: Filtered On-device Continuous Activity Logging for Efficient Personal Desktop Summarization
FOCAL cuts token use by 60% and VLM calls by 72% on desktop streams while raising key recall from 0.38 to 0.61 and staying robust to task switches that break baselines.
- SAGE: Selective Attention-Guided Extraction for Token-Efficient Document Indexing
SAGE is a training-free context reduction method that converts attention signals from a small LLM into a differential relevance heatmap to select top units for downstream QA, achieving competitive accuracy at 10% token budget on benchmarks like QuALITY-hard.