Omniact: A dataset and benchmark for enabling multimodal generalist autonomous agents for desktop and web

Raghav Kapoor, Yash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem AlShikh, Ruslan Salakhutdinov · 2024

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.

Showing 1 of 1 citing paper.

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing cs.CV · 2026-05-19 · unverdicted · none · ref 19
CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.