Transforming Question Answering Datasets Into Natural Language Inference Datasets

Dorottya Demszky; Kelvin Guu; Percy Liang

arxiv: 1809.02922 · v2 · pith:KGZFKOWInew · submitted 2018-09-09 · 💻 cs.CL

classification 💻 cs.CL

keywords datasets · inference · language · answering · automatically · dataset · natural · question

Existing datasets for natural language inference (NLI) have propelled research on language understanding. We propose a new method for automatically deriving NLI datasets from the growing abundance of large-scale question answering datasets. Our approach hinges on learning a sentence transformation model which converts question-answer pairs into their declarative forms. Despite being primarily trained on a single QA dataset, we show that it can be successfully applied to a variety of other QA resources. Using this system, we automatically derive a new freely available dataset of over 500k NLI examples (QA-NLI), and show that it exhibits a wide range of inference phenomena rarely seen in previous NLI datasets.
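
As a rough illustration of the conversion the abstract describes, the sketch below turns a question-answer pair into a declarative sentence and pairs it with a passage to form an NLI-style example. It is not the authors' learned sentence transformation model: the single rewrite rule, the names `toy_qa_to_declarative`, `make_nli_example`, and `NLIExample`, and the label strings are all hypothetical. The sketch also assumes that a correct answer yields an entailed hypothesis and an incorrect answer a non-entailed one.

```python
from dataclasses import dataclass


@dataclass
class NLIExample:
    premise: str      # the passage the question was asked about
    hypothesis: str   # declarative form of the question-answer pair
    label: str        # hypothetical label strings, not taken from the paper


def toy_qa_to_declarative(question: str, answer: str) -> str:
    """Toy stand-in for a learned QA-to-declarative model.

    Handles only subject questions of the form 'Who/What <verb phrase>?'
    by substituting the answer for the wh-word. Anything else falls back
    to a generic template.
    """
    q = question.strip().rstrip("?").strip()
    words = q.split()
    if len(words) > 1 and words[0].lower() in {"who", "what"}:
        return f"{answer} {' '.join(words[1:])}."
    return f"The answer to '{q}?' is {answer}."


def make_nli_example(passage: str, question: str, answer: str,
                     is_correct: bool) -> NLIExample:
    """Pair the passage (premise) with the declarative sentence (hypothesis).

    Assumption of this sketch: correct answers give entailed hypotheses,
    incorrect answers give non-entailed ones.
    """
    hypothesis = toy_qa_to_declarative(question, answer)
    label = "entailed" if is_correct else "not entailed"
    return NLIExample(premise=passage, hypothesis=hypothesis, label=label)


example = make_nli_example(
    passage="Hamlet is a tragedy written by William Shakespeare.",
    question="Who wrote Hamlet?",
    answer="Shakespeare",
    is_correct=True,
)
print(example.hypothesis)
```

A real system would replace the hand-written rule with a trained model, since most questions do not reduce to a simple wh-word substitution.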

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
cs.CL 2019-05 accept novelty 7.0

BoolQ introduces naturally occurring yes/no questions as a challenging benchmark where BERT fine-tuned on MultiNLI reaches 80.4% accuracy against 90% human performance.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
cs.CL 2018-04 unverdicted novelty 7.0

GLUE is a multi-task benchmark for general natural language understanding that includes a diagnostic test suite and finds limited gains from current multi-task learning methods over single-task training.

Compositional Consistency-Guided Decoding for Three-Way Logical Question Answering
cs.CL 2026-03 unverdicted novelty 6.0

CGD-PD improves three-way logical QA accuracy by up to 16% relative on FOLIO through negation-consistent projection and proof-driven disambiguation that reduces Unknown predictions across frontier LLMs.

Ultra-Low-Dimensional Prompt Tuning via Random Projection
cs.CL 2025-02 unverdicted novelty 6.0

ULPT optimizes prompts in ultra-low dimensions with frozen random up-projection to cut training parameters by 98% while matching vanilla prompt tuning performance on NLP tasks.

Vanishing Contributions: A Unified Framework for Smooth and Iterative Model Compression
cs.LG 2025-10 unverdicted novelty 5.0

VCON is a unified framework for smooth iterative DNN compression that uses parallel execution and an affine combination to progressively replace the original model with its compressed form during fine-tuning.