CommonWhy is a new dataset of 15,000 why-questions for evaluating LLMs on entity-based causal commonsense reasoning grounded in Wikidata.
CommAI: Evaluating the first steps towards a useful general AI
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal. However, most current research focuses instead on important but narrow applications, such as image classification or machine translation. We believe this to be largely due to the lack of objective ways to measure progress towards broad machine intelligence. In order to fill this gap, we propose here a set of concrete desiderata for general AI, together with a platform to test machines on how well they satisfy such desiderata, while keeping all further complexities to a minimum.
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
CommonWhy: A Dataset for Evaluating Entity-Based Causal Commonsense Reasoning in Large Language Models
CommonWhy is a new dataset of 15,000 why-questions for evaluating LLMs on entity-based causal commonsense reasoning grounded in Wikidata.