Vimo: A generative visual gui world model for app agents

Dezhao Luo, Bohan Tang, Kang Li, Georgios Papoudakis, Jifei Song, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao · 2025 · arXiv 2504.13936

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation

cs.CL · 2026-04-13 · unverdicted · novelty 8.0

OccuBench is a new benchmark for AI agents on real-world occupational tasks via LLM-driven simulators, showing no model dominates all industries, implicit faults are hardest, and larger models with more reasoning perform better.

Beyond Binary: Reframing GUI Critique as Continuous Semantic Alignment

cs.LG · 2026-05-14 · unverdicted · novelty 7.0 · 2 refs

BBCritic reframes GUI critique as continuous semantic alignment via contrastive learning in an affordance space, outperforming larger binary SOTA models on a new four-level hierarchical benchmark without extra annotations.

UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics

cs.LG · 2026-02-11 · unverdicted · novelty 5.0

UI-Oceanus shows that continual pre-training on forward dynamics predictions from synthetic GUI exploration improves agent success rates by 7% offline and 16.8% online, with gains scaling by data volume.

How Mobile World Model Guides GUI Agents?

cs.AI · 2026-05-11 · unverdicted · novelty 4.0 · 2 refs

World models trained on delta text, full text, diffusion images, and renderable code achieve SoTA on two benchmarks and improve downstream GUI agent performance on three mobile datasets with modality-specific strengths.

Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability

cs.CL · 2026-05-08 · unverdicted · novelty 4.0

The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

cs.AI · 2026-04-24

citing papers explorer

Showing 6 of 6 citing papers.

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation cs.CL · 2026-04-13 · unverdicted · none · ref 10
OccuBench is a new benchmark for AI agents on real-world occupational tasks via LLM-driven simulators, showing no model dominates all industries, implicit faults are hardest, and larger models with more reasoning perform better.
Beyond Binary: Reframing GUI Critique as Continuous Semantic Alignment cs.LG · 2026-05-14 · unverdicted · none · ref 97 · 2 links
BBCritic reframes GUI critique as continuous semantic alignment via contrastive learning in an affordance space, outperforming larger binary SOTA models on a new four-level hierarchical benchmark without extra annotations.
UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics cs.LG · 2026-02-11 · unverdicted · none · ref 27
UI-Oceanus shows that continual pre-training on forward dynamics predictions from synthetic GUI exploration improves agent success rates by 7% offline and 16.8% online, with gains scaling by data volume.
How Mobile World Model Guides GUI Agents? cs.AI · 2026-05-11 · unverdicted · none · ref 10 · 2 links
World models trained on delta text, full text, diffusion images, and renderable code achieve SoTA on two benchmarks and improve downstream GUI agent performance on three mobile datasets with modality-specific strengths.
Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability cs.CL · 2026-05-08 · unverdicted · none · ref 55
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond cs.AI · 2026-04-24 · unreviewed · ref 256

Vimo: A generative visual gui world model for app agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer