ScreenAgent: A Vision Language Model-driven Computer Control Agent

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · baseline: 1

years

2026: 4 · 2024: 3

representative citing papers

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

cs.CL · 2024-12-05 · conditional · novelty 6.0

Aguvis presents a pure vision-based framework for autonomous GUI agents using structured reasoning via inner monologue, a new multimodal dataset, and two-stage training to reach SOTA on offline and online benchmarks.
