EQuANt (Enhanced Question Answer Network)

Dominic Danks; François-Xavier Aubet; Yuchen Zhu

arXiv: 1907.00708 · v2 · pith:VOSGX6BJnew · submitted 2019-06-24 · 💻 cs.CL · cs.LG · stat.ML

Pith reviewed 2026-05-25 17:52 UTC · model grok-4.3

classification 💻 cs.CL · cs.LG · stat.ML

keywords machine reading comprehension · question answering · SQuAD · unanswerable questions · QANet · neural network models

The pith

EQuANt extends QANet to detect unanswerable questions and nearly doubles performance over a lightweight baseline on SQuAD 2.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents EQuANt as an extension of the QANet model designed to handle questions that have no answer in the provided context. It trains and tests this model on the SQuAD 2 dataset, which mixes answerable and unanswerable questions. Results show EQuANt reaches close to twice the performance of a lightweight version of the original QANet on this dataset. The work also finds that EQuANt trained on SQuAD 2 outperforms, on SQuAD 1.1, the lightweight QANet trained and evaluated on SQuAD 1.1, which the authors read as evidence for multi-task learning in reading comprehension.

Core claim

EQuANt shows that QANet can be extended to the unanswerable domain: it achieves results close to 2 times better than a lightweight QANet baseline on SQuAD 2, and, when trained on SQuAD 2, it exceeds on SQuAD 1.1 the performance of the lightweight QANet trained and evaluated on SQuAD 1.1.

What carries the argument

Specific architectural extensions added to the original QANet model to enable detection and handling of unanswerable questions.
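
As a rough illustration of what such an extension might look like, the sketch below pools a variable-length sequence of encoder outputs into a fixed-size vector with a masked global mean and passes it through three feedforward layers to produce an answerability score. The passage quoted further down this page mentions global mean pooling, three feedforward layers, and an answerability score; the layer sizes, ReLU activations, final sigmoid, random weights, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

def masked_mean_pool(hidden, mask):
    """Average the hidden vectors at positions where mask is 1 (padding is ignored)."""
    dim = len(hidden[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden, mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vec[k]
    return [t / max(count, 1) for t in total]

def dense(x, weights, bias, relu=True):
    """One feedforward layer; weights holds one row per output unit."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out

def answerability_score(hidden, mask, layers):
    """Pool a variable-length sequence to a fixed vector, then map it to a score in (0, 1)."""
    x = masked_mean_pool(hidden, mask)
    for i, (w, b) in enumerate(layers):
        # Hypothetical choice: ReLU on every layer except the last.
        x = dense(x, w, b, relu=(i < len(layers) - 1))
    return 1.0 / (1.0 + math.exp(-x[0]))

def random_layers(sizes, seed=0):
    """Random placeholder weights; a trained model would learn these."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]

# Three feedforward layers: 4 -> 8 -> 8 -> 1 (sizes are illustrative only).
layers = random_layers([4, 8, 8, 1])
hidden = [[1.0, 0.0, 2.0, -1.0], [0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 0.0, 1.0]]
score = answerability_score(hidden, [1, 1, 1], layers)
```

Because the mean is taken only over unmasked positions, padding a sequence to a longer length leaves the score unchanged, which is one way a fixed-size output can be obtained from variable-length inputs.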

If this is right

EQuANt reaches nearly double the score of the lightweight QANet baseline on SQuAD 2.
EQuANt trained on SQuAD 2 outperforms, on SQuAD 1.1, the lightweight QANet trained and evaluated only on SQuAD 1.1.
Multi-task learning across answerable and unanswerable questions benefits model performance in machine reading comprehension.
The same extension approach can be applied to other models that currently lack support for unanswerable questions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The multi-task benefit observed here suggests similar gains could appear when mixing other question-answering datasets that differ in answerability.
If the extensions generalize, they could be tested on newer reading-comprehension benchmarks that include unanswerable items.
Further work could isolate which single extension contributes most to the improvement by ablating them one at a time.

Load-bearing premise

The chosen extensions to QANet are what allow it to handle unanswerable questions, and the lightweight QANet version is a fair baseline for comparison.

What would settle it

Running the original QANet or another unmodified baseline on SQuAD 2 and finding performance comparable to EQuANt would show the extensions did not drive the reported gains.

Figures

Figures reproduced from arXiv: 1907.00708 by Dominic Danks, François-Xavier Aubet, Yuchen Zhu.

**Figure 1.** The mechanism of one QANet encoder block.

**Figure 2.** EQuANt architecture: combination of QANet and unanswerability extension module.

**Figure 3.** Attempts to extend QANet to EQuANt.

**Figure 4.** Attention maps. Top: Unanswerable question. Middle: Answerable question. Bottom: Shuffled question.

Original abstract

Machine Reading Comprehension (MRC) is an important topic in the domain of automated question answering and in natural language processing more generally. Since the release of the SQuAD 1.1 and SQuAD 2 datasets, progress in the field has been particularly significant, with current state-of-the-art models now exhibiting near-human performance at both answering well-posed questions and detecting questions which are unanswerable given a corresponding context. In this work, we present Enhanced Question Answer Network (EQuANt), an MRC model which extends the successful QANet architecture of Yu et al. to cope with unanswerable questions. By training and evaluating EQuANt on SQuAD 2, we show that it is indeed possible to extend QANet to the unanswerable domain. We achieve results which are close to 2 times better than our chosen baseline obtained by evaluating a lightweight version of the original QANet architecture on SQuAD 2. In addition, we report that the performance of EQuANt on SQuAD 1.1 after being trained on SQuAD2 exceeds that of our lightweight QANet architecture trained and evaluated on SQuAD 1.1, demonstrating the utility of multi-task learning in the MRC context.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The claimed 2x gain on SQuAD 2 is hard to credit to the extensions because the baseline is a deliberately lightweight QANet.

The main point is that EQuANt is reported to do nearly twice as well as the baseline on SQuAD 2, but the baseline is a lightweight version of the original QANet. That choice makes it unclear how much the new components for unanswerable questions contribute, as opposed to the model simply being stronger than the baseline. The authors extend QANet to handle SQuAD 2 and show that multi-task training on SQuAD 2 improves results on SQuAD 1.1 compared to the lightweight version. The idea of adapting an existing architecture this way is straightforward, and the multi-task observation is a small positive.

The paper falls short on details. There is no description of the specific changes made to QANet, no hyperparameters, and no error bars on the results. This makes the numbers hard to trust or build on. The citation to Yu et al. is appropriate, and there is no sign of circular reasoning in the claims.

The work is a routine adaptation without new theoretical grounding or extensive experiments. Readers already deep in MRC might pick up the multi-task point, but it does not seem to offer enough for most people to cite or follow up on. I would not recommend sending this to peer review. The baseline issue and missing implementation details are too central to overlook.

Referee Report

3 major / 0 minor

Summary. The manuscript introduces EQuANt as an extension of the QANet architecture to address unanswerable questions in machine reading comprehension. It reports that training and evaluating EQuANt on SQuAD 2 yields performance close to twice that of a lightweight QANet baseline on the same dataset, and that multi-task training on SQuAD 2 improves EQuANt's results on SQuAD 1.1 relative to the lightweight baseline trained directly on SQuAD 1.1.

Significance. If the reported gains can be attributed to the proposed extensions rather than baseline capacity differences and are supported by full implementation details, the work would demonstrate a concrete adaptation of QANet for the unanswerable-question setting and highlight multi-task benefits across SQuAD variants. At present the absence of these details prevents evaluation of whether the central empirical claims hold.

major comments (3)

[Abstract] The headline claim that EQuANt achieves results 'close to 2 times better' than the baseline is not accompanied by exact metric values (F1/EM), error bars, or any numerical comparison; without these the central empirical result cannot be verified.
[Abstract] The baseline is characterized only as 'a lightweight version of the original QANet architecture' with no specification of the capacity reductions (layer count, hidden size, parameter count, or other architectural changes). This is load-bearing for the 2x improvement claim, because the observed gains could arise simply from restoring capacity that was removed in the baseline rather than from the unanswerable-question extensions.
[Abstract] No implementation details are provided for the modifications made to QANet, the training hyperparameters, or the exact procedure used to obtain the multi-task SQuAD 1.1 result; these omissions prevent assessment of whether the architectural extensions are sufficient to handle unanswerable questions.
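
For readers unfamiliar with the F1/EM metrics the referee asks for, the sketch below shows one common way exact match and token-level F1 are computed per example for SQuAD-style answers, including the convention that an empty string stands for "no answer". It is a simplified illustration, not the official evaluation script and not a reproduction of the paper's numbers; the function names and the exact normalization steps are assumptions, and the usual maximum over several gold answers is omitted.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1(prediction, gold):
    """Token-level F1 between a predicted and a gold answer string."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Unanswerable convention: an empty answer only matches an empty answer.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A correct abstention scores 1.0; answering an unanswerable question scores 0.0.
abstain_f1 = f1("", "")
wrong_answer_f1 = f1("Paris", "")
```

Dataset-level scores are then averages of these per-example values, which is why a model that cannot abstain is penalized on every unanswerable question.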

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and will revise the abstract to incorporate the requested details for improved clarity and verifiability.

Referee: [Abstract] The headline claim that EQuANt achieves results 'close to 2 times better' than the baseline is not accompanied by exact metric values (F1/EM), error bars, or any numerical comparison; without these the central empirical result cannot be verified.

Authors: We agree that the abstract should report the precise F1 and EM values for both EQuANt and the baseline on SQuAD 2, together with any available error bars from repeated runs and the direct numerical ratio. The revised abstract will include these exact figures and comparisons.

revision: yes

Referee: [Abstract] The baseline is characterized only as 'a lightweight version of the original QANet architecture' with no specification of the capacity reductions (layer count, hidden size, parameter count, or other architectural changes). This is load-bearing for the 2x improvement claim, because the observed gains could arise simply from restoring capacity that was removed in the baseline rather than from the unanswerable-question extensions.

Authors: The observation is correct. The revised abstract will explicitly state the capacity reductions used for the lightweight baseline, including layer count, hidden size, and resulting parameter count, to demonstrate that the reported gains stem from the proposed extensions.

revision: yes

Referee: [Abstract] No implementation details are provided for the modifications made to QANet, the training hyperparameters, or the exact procedure used to obtain the multi-task SQuAD 1.1 result; these omissions prevent assessment of whether the architectural extensions are sufficient to handle unanswerable questions.

Authors: We acknowledge the need for these details in the abstract. The revision will add a concise description of the QANet modifications for unanswerable questions, the training hyperparameters, and the multi-task procedure, while retaining full specifications in the methods section.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results on public benchmarks

The paper describes an architectural extension to QANet and reports performance numbers obtained by training and evaluating on the public SQuAD 1.1 and SQuAD 2 benchmarks. No equations, fitted parameters, or derivation steps are presented that reduce any reported quantity to a definition, a self-citation chain, or an input by construction. The comparison is against a separately implemented lightweight baseline; this is an empirical claim whose validity can be checked externally and does not collapse into self-reference. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract; no free parameters, axioms, or invented entities are described.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: EQuANt 3 ... two encoder transformations ... three feedforward layers ... global mean pooling ... answerability score

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: loss function ... L0(p0) + δ(L1(p1) + L2(p2)) ... Adam optimiser
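
The loss fragment quoted above can be made concrete with a small sketch. The excerpt does not define its terms, so the following readings are assumptions: p0 is the predicted probability that the question is answerable and L0 is binary cross-entropy on it; p1 and p2 are the predicted distributions over start and end positions and L1, L2 are their negative log-likelihoods; δ is 1 for answerable questions and 0 otherwise, so the span terms are switched off when there is no answer. The function name is hypothetical.

```python
import math

def span_and_answerability_loss(p0, p1, p2, answerable, start=None, end=None, eps=1e-12):
    """Compute L0(p0) + delta * (L1(p1) + L2(p2)) for one example (assumed forms)."""
    # Assumption: delta is 1 for answerable questions and 0 otherwise.
    delta = 1.0 if answerable else 0.0
    # Assumption: L0 is binary cross-entropy on the answerability probability p0.
    l0 = -math.log(max(p0 if answerable else 1.0 - p0, eps))
    span_loss = 0.0
    if answerable:
        # Assumption: L1 and L2 are negative log-likelihoods of the gold start and end.
        span_loss = -math.log(max(p1[start], eps)) - math.log(max(p2[end], eps))
    return l0 + delta * span_loss

# Unanswerable example: only the answerability term contributes.
loss_no_answer = span_and_answerability_loss(0.5, [0.5, 0.5], [0.5, 0.5], answerable=False)
# Answerable example: the span terms for the gold start and end are added.
loss_answer = span_and_answerability_loss(0.9, [0.25, 0.75], [0.5, 0.5], True, start=1, end=1)
```

The excerpt also mentions the Adam optimiser; this sketch covers only the per-example loss and leaves optimisation aside.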

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 12 internal anchors

[1] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR, abs/1810.04805.
[2] Bhuwan Dhingra, Hanxiao Liu, William W. Cohen, and Ruslan Salakhutdinov. Gated-Attention Readers for Text Comprehension. CoRR, abs/1606.01549.
[3] Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations. CoRR, abs/1511.02301.
[4] Minghao Hu, Furu Wei, Yuxing Peng, Zhen Huang, Nan Yang, and Ming Zhou. Read + Verify: Machine Reading Comprehension with Unanswerable Questions. CoRR, abs/1808.05759.
[5] Omer Levy, Minjoon Seo, Eunsol Choi, and Luke Zettlemoyer. Zero-Shot Relation Extraction via Reading Comprehension. CoRR, abs/1706.04115.
[6] Xiaodong Liu, Wei Li, Yuwei Fang, Aerin Kim, Kevin Duh, and Jianfeng Gao. Stochastic Answer Networks for SQuAD 2.0. CoRR, abs/1809.09194.
[7] Xiaodong Liu, Yelong Shen, Kevin Duh, and Jianfeng Gao. Stochastic Answer Networks for Machine Reading Comprehension. CoRR, abs/1712.03556.
[8] Know What You Don't Know: Unanswerable Questions for SQuAD. CoRR, abs/1806.03822.
[9] Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text. CoRR, abs/1606.05250.
[10] Min Joon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. Bidirectional Attention Flow for Machine Comprehension. CoRR, abs/1611.01603.
[11] Fu Sun, Linyang Li, Xipeng Qiu, and Yang Liu. U-Net: Machine Reading Comprehension with Unanswerable Questions. CoRR, abs/1810.06638.
[12] QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. CoRR, abs/1804.09541.
