In heuristic evaluation, GPT-4o identified only 21.2% of the usability issues found by human experts. It also discovered 27 additional issues, had difficulty with certain heuristics, and generated false positives.
Journal on Interactive Systems 15(1), 810–822 (Aug 2024)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.HC (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Can GPT-4o Evaluate Usability Like Human Experts? A Comparative Study on Issue Identification in Heuristic Evaluation