Typically used to wait for unfinished webpage processes, with a duration of 5 seconds.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AI (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Agents learn to dynamically construct and organize memory from multimodal experiences, improving performance over static designs in task-dependent settings.
citing papers explorer
- InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation?
  InteractWeb-Bench shows that frontier multimodal AI agents remain trapped in blind execution when generating websites from perturbed, low-quality non-expert instructions.
- Learning to Learn from Multimodal Experience
  Agents learn to dynamically construct and organize memory from multimodal experiences, improving performance over static designs in task-dependent settings.