arxiv: 1905.09531 · v2 · pith:VG4AG2AEnew · submitted 2019-05-23 · 💻 cs.CL

MCScript2.0: A Machine Comprehension Corpus Focused on Script Events and Participants

classification 💻 cs.CL
keywords: comprehension, corpus, knowledge, machine, mcscript2, questions, script, approx

We introduce MCScript2.0, a machine comprehension corpus for the end-to-end evaluation of script knowledge. MCScript2.0 contains approx. 20,000 questions on approx. 3,500 texts, crowdsourced based on a new collection process that results in challenging questions. Half of the questions cannot be answered from the reading texts, but require the use of commonsense and, in particular, script knowledge. We give a thorough analysis of our corpus and show that while the task is not challenging to humans, existing machine comprehension models fail to perform well on the data, even if they make use of a commonsense knowledge base. The dataset is available at http://www.sfb1102.uni-saarland.de/?page_id=2582
