GPT-4o identified only 21.2% of the usability issues found by human experts in heuristic evaluation, while discovering 27 additional issues and exhibiting difficulties with certain heuristics and generating false positives.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.HC 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Can GPT-4o Evaluate Usability Like Human Experts? A Comparative Study on Issue Identification in Heuristic Evaluation
GPT-4o identified only 21.2% of the usability issues found by human experts in heuristic evaluation, while discovering 27 additional issues and exhibiting difficulties with certain heuristics and generating false positives.