2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background 1
Years: 2026 (2)
Verdicts: UNVERDICTED 2
Roles: background 1
Polarities: background 1
Citing papers explorer
- Recommending Usability Improvements with Multimodal Large Language Models
  Multimodal LLMs can detect usability issues from screen recordings, explain them via Nielsen's heuristics, and rank improvement recommendations, with engineer feedback indicating practical usefulness for teams lacking experts.
- TEC: A Collection of Human Trial-and-error Trajectories for Problem Solving
  TEC is a new public dataset of detailed human trial-and-error trajectories and reflections on web tasks, with humans showing substantially higher accuracy than LLMs.