Introduces 2WikiMultiHopQA, a multi-hop QA dataset with explicit evidence chains generated via templates and Wikidata logical rules to force and evaluate multi-hop reasoning.
ArXiv, abs/2004.07347
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
TaNOS decouples table semantics from numerical structure via anonymization, sketches, and program-first self-supervision. It reaches 80.13% FinQA accuracy using 10% of the data and a near-zero cross-domain gap, versus a gap of over 10 percentage points for standard fine-tuning.
RELOOP unifies retrieval across text, tables, and KGs via hierarchical sequences and dual-agent guided iteration, reporting EM/F1 gains over baselines on HotpotQA, HybridQA/TAT-QA, and MetaQA.
citing papers explorer
- Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps