AQuaUI uses adaptive quadtrees to cut visual tokens in GUI-agent LMMs by up to 29.52% at inference time while retaining 99.06% of full-token accuracy on grounding and navigation benchmarks.
Focusui: Efficient ui grounding via position-preserving visual token selection
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
ReVision reduces token usage by 46% and improves success rate by 3% on OSWorld, WebTailBench, and AgentNetBench by removing redundant visual patches from 5-history trajectories with Qwen2.5-VL-7B.
citing papers explorer
-
AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees
AQuaUI uses adaptive quadtrees to cut visual tokens in GUI-agent LMMs by up to 29.52% at inference time while retaining 99.06% of full-token accuracy on grounding and navigation benchmarks.
-
ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction
ReVision reduces token usage by 46% and improves success rate by 3% on OSWorld, WebTailBench, and AgentNetBench by removing redundant visual patches from 5-history trajectories with Qwen2.5-VL-7B.