Knowledge-incorporating ESIM models for Response Selection in Retrieval-based Dialog Systems

Jatin Ganhotra; Kshitij Fadnis; Siva Sankalp Patel

arXiv: 1907.05792 · v1 · pith:V2HB2CHOnew · submitted 2019-07-11 · 💻 cs.CL · cs.AI · cs.IR

Pith reviewed 2026-05-24 23:06 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.IR

keywords: ESIM, dialog systems, response selection, knowledge incorporation, retrieval-based dialogs, DSTC7, Ubuntu dataset, Advising dataset

The pith

Incorporating external knowledge and similar dialogs into ESIM improves next-utterance prediction in goal-oriented dialog systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the Enhanced Sequential Inference Model to handle retrieval-based dialog tasks that require external information. K-ESIM adds domain knowledge directly into the model while T-ESIM pulls context from similar past conversations. Both are tested against the baseline ESIM on the Ubuntu and Advising datasets from the DSTC7 response selection track. The authors report that these additions produce measurable gains in selecting the correct next utterance from candidate lists. The work targets the practical need for dialog systems to draw on outside facts when completing goals such as reservations or course recommendations.

Core claim

The authors claim that K-ESIM, which incorporates external domain knowledge, and T-ESIM, which leverages information from similar conversations, produce performance improvements over the baseline ESIM model when predicting the next utterance in partial conversations from the Ubuntu and Advising datasets.

What carries the argument

K-ESIM and T-ESIM extensions to the ESIM architecture that integrate external domain knowledge and targeted information from similar dialogs into the inference process for response selection.
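As a rough illustration of the baseline step these extensions build on, the sketch below shows ESIM-style soft alignment between a context and a candidate response, followed by the enhancement that concatenates each vector with its aligned counterpart, their difference, and their element-wise product (Chen et al. 2017). It is a toy sketch on hand-written vectors, not the authors' implementation: the function names `soft_align` and `enhance` are hypothetical, the input vectors stand in for encoder outputs, and how K-ESIM or T-ESIM feed knowledge into this step is not described in the text above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_align(context, response):
    """For each context vector, return the attention-weighted sum of
    response vectors, using dot-product similarity as the score."""
    aligned = []
    for c in context:
        weights = softmax([dot(c, r) for r in response])
        dims = len(response[0])
        aligned.append(
            [sum(w * r[d] for w, r in zip(weights, response)) for d in range(dims)]
        )
    return aligned


def enhance(original, aligned):
    """Concatenate [a; a~; a - a~; a * a~] for each position."""
    enhanced = []
    for a, t in zip(original, aligned):
        diff = [x - y for x, y in zip(a, t)]
        prod = [x * y for x, y in zip(a, t)]
        enhanced.append(a + t + diff + prod)
    return enhanced


# Toy vectors standing in for encoded context and response tokens (made up).
context = [[1.0, 0.0], [0.0, 1.0]]
response = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

aligned_context = soft_align(context, response)
features = enhance(context, aligned_context)
print(len(features), len(features[0]))
```

In a full model these enhanced features would be passed to a composition layer and pooled into a matching score per candidate; the sketch stops before that point.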

If this is right

K-ESIM enables better interaction with external knowledge sources during goal-oriented tasks such as booking or advising.
T-ESIM improves prediction by retrieving context from similar prior dialogs.
Both extensions maintain end-to-end training while increasing accuracy on candidate response selection.
The approach applies to customer-support scenarios that rely on domain-specific facts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same integration pattern could be tested on other retrieval-based NLP tasks outside dialog.
Joint use of both knowledge sources and similar-dialog retrieval might produce further additive gains if combined in one model.
The method may reduce reliance on hand-crafted features when building practical dialog systems.

Load-bearing premise

External domain knowledge and similar-dialog data can be added to the ESIM model in a way that yields net gains without introducing integration errors.

What would settle it

A head-to-head evaluation on the DSTC7 Ubuntu or Advising datasets in which K-ESIM or T-ESIM shows no accuracy gain or a loss relative to plain ESIM would falsify the reported improvements.
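A head-to-head comparison of this kind can be sketched as follows. The example assumes ranking metrics such as Recall@k and mean reciprocal rank, which the text above does not name, and it runs on made-up rankings rather than any result from the paper. The function names and the `baseline_rankings` / `extended_rankings` data are hypothetical.

```python
def recall_at_k(rankings, gold, k):
    """Fraction of dialogs whose correct response is among the top k candidates."""
    hits = sum(1 for ranked, answer in zip(rankings, gold) if answer in ranked[:k])
    return hits / len(gold)


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the correct response; contributes 0 when it is absent."""
    total = 0.0
    for ranked, answer in zip(rankings, gold):
        if answer in ranked:
            total += 1.0 / (ranked.index(answer) + 1)
    return total / len(gold)


# Made-up data: one correct response id per dialog, and each model's
# ranked candidate list for that dialog (best candidate first).
gold = ["r1", "r2", "r3", "r4"]
baseline_rankings = [
    ["r1", "x", "y"],
    ["x", "r2", "y"],
    ["x", "y", "r3"],
    ["x", "y", "z"],
]
extended_rankings = [
    ["r1", "x", "y"],
    ["r2", "x", "y"],
    ["x", "r3", "y"],
    ["x", "y", "r4"],
]

for name, rankings in (("baseline", baseline_rankings), ("extended", extended_rankings)):
    print(
        name,
        recall_at_k(rankings, gold, 1),
        round(mean_reciprocal_rank(rankings, gold), 3),
    )
```

The reported improvements would be called into question if, on the real test sets, the extended model's scores were no higher than the baseline's under the same candidate lists.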

Figures

Figures reproduced from arXiv: 1907.05792 by Jatin Ganhotra, Kshitij Fadnis, Siva Sankalp Patel.

**Figure 1.** K-ESIM: a high-level overview of the K-ESIM model, which incorporates external knowledge.

Baseline model: ESIM. We use the ESIM model proposed by Chen et al. (2017) as the baseline model. The implementation details for the baseline model are provided in the Appendix. As mentioned in the 'Problem Statement' section, the task is to select the next response given the dialog history (context). The multi-turn dialog his…

Original abstract

Goal-oriented dialog systems, which can be trained end-to-end without manually encoding domain-specific features, show tremendous promise in the customer support use-case e.g. flight booking, hotel reservation, technical support, student advising etc. These dialog systems must learn to interact with external domain knowledge to achieve the desired goal e.g. recommending courses to a student, booking a table at a restaurant etc. This paper presents extended Enhanced Sequential Inference Model (ESIM) models: a) K-ESIM (Knowledge-ESIM), which incorporates the external domain knowledge and b) T-ESIM (Targeted-ESIM), which leverages information from similar conversations to improve the prediction accuracy. Our proposed models and the baseline ESIM model are evaluated on the Ubuntu and Advising datasets in the Sentence Selection track of the latest Dialog System Technology Challenge (DSTC7), where the goal is to find the correct next utterance, given a partial conversation, from a set of candidates. Our preliminary results suggest that incorporating external knowledge sources and leveraging information from similar dialogs leads to performance improvements for predicting the next utterance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds knowledge and similar-dialog signals to ESIM for response selection, but the abstract supplies no numbers or ablations to show whether the changes help.

The main move here is taking the existing ESIM architecture and creating two variants: K-ESIM that folds in external domain knowledge and T-ESIM that pulls signals from similar past conversations. Both are aimed at next-utterance selection on the DSTC7 Ubuntu and Advising sets. That is a straightforward extension rather than a new model family, and the choice of public benchmarks keeps the work grounded in reproducible data. The focus on customer-support style dialogs where knowledge matters is also sensible.

Beyond that, the text stays at the level of describing the idea and stating that preliminary results suggest gains. No metrics, no ablation tables, no error analysis, and no implementation specifics appear in the abstract, so it is impossible to judge whether the added components produce net improvement or simply increase complexity. The evaluation setup itself looks clean because it relies on external datasets.

This work would mainly interest people already running ESIM-style models on retrieval-based dialog who want to test quick additions of knowledge sources. It does not offer a new framework or broad theoretical shift, so readers looking for first-principles advances will not find them. The paper deserves peer review if the full version contains the missing experimental details and clear, controlled gains; based on the abstract alone the evidence is too thin to evaluate the central claim.

Referee Report

1 major / 0 minor

Summary. The paper proposes two extensions to the Enhanced Sequential Inference Model (ESIM) for response selection in retrieval-based goal-oriented dialog systems: K-ESIM, which incorporates external domain knowledge, and T-ESIM, which leverages information from similar conversations. These are evaluated against the baseline ESIM on the Ubuntu and Advising datasets from the DSTC7 Sentence Selection track, with the claim that the extensions yield performance improvements for predicting the next utterance.

Significance. If substantiated, the approach could provide a concrete method for injecting external knowledge into neural dialog models without manual feature engineering, addressing a recurring challenge in customer-support dialog systems. No machine-checked proofs, reproducible code, or parameter-free derivations are present to credit.

major comments (1)

[Abstract] The central claim that K-ESIM and T-ESIM produce performance improvements is unsupported by any numeric results, ablation studies, error bars, or implementation details, so the improvement cannot be verified from the manuscript.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their comments on our manuscript. We respond to the major comment below.

Referee: [Abstract] The central claim that K-ESIM and T-ESIM produce performance improvements is unsupported by any numeric results, ablation studies, error bars, or implementation details, so the improvement cannot be verified from the manuscript.

Authors: We agree that the abstract does not currently include specific numeric results, ablation studies, error bars, or implementation details to directly support the claim of performance improvements. The manuscript body presents the evaluation on the Ubuntu and Advising datasets from DSTC7. To make the central claim verifiable from the abstract itself, we will revise the abstract in the next version to include key quantitative results and a brief mention of the experimental setup.

revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper defines K-ESIM and T-ESIM as architectural extensions to the existing ESIM model, then reports empirical results on the public DSTC7 Ubuntu and Advising datasets. No equations or claims reduce a prediction to a fitted input by construction, no self-citation chain is invoked to justify uniqueness or an ansatz, and the evaluation data and metrics are external to the authors' prior work. The central claim (performance improvement from knowledge incorporation) is therefore an independent experimental outcome rather than a definitional or self-referential tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract supplies only high-level domain assumptions about the utility of knowledge incorporation and similar-dialog signals; no free parameters, invented entities, or formal axioms are stated.

axioms (2)

domain assumption: External domain knowledge can be integrated into neural dialog models to improve performance.
Central premise for proposing K-ESIM.
domain assumption: Information from similar conversations provides useful signals for response selection.
Central premise for proposing T-ESIM.

pith-pipeline@v0.9.0 · 5727 in / 1250 out tokens · 30887 ms · 2026-05-24T23:06:57.809543+00:00 · methodology

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 15 internal anchors

[Abadi et al. 2016] Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G. S.; Davis, A.; Dean, J.; Devin, M.; et al. 2016. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.

[Bartl and Spanakis 2017] Bartl, A., and Spanakis, G. 2017. A retrieval-based dialogue system utilizing utterance and context embeddings. In Machine Learning and Applications (ICMLA), 2017 16th IEEE International Conference on, 1120–1125. IEEE.

[Bordes, Boureau, and Weston 2016] Bordes, A.; Boureau, Y.-L.; and Weston, J. 2016. Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683.

[Chen et al. 2017] Chen, Q.; Zhu, X.; Ling, Z.-H.; Wei, S.; Jiang, H.; and Inkpen, D. 2017. Enhanced LSTM for natural language inference. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, 1657–1668.

[Dong and Huang 2018] Dong, J., and Huang, J. 2018. Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus. arXiv preprint arXiv:1802.02614.

[dos Santos et al. 2015] dos Santos, C.; Guimaraes, V.; Niterói, R.; and de Janeiro, R. 2015. Boosting named entity recognition with neural character embeddings. In Proceedings of NEWS 2015 The Fifth Named Entities Workshop.

[Eric and Manning 2017] Eric, M., and Manning, C. D. 2017. Key-value retrieval networks for task-oriented dialogue. arXiv preprint arXiv:1705.05414.

[Ghazvininejad et al. 2017] Ghazvininejad, M.; Brockett, C.; Chang, M.-W.; Dolan, B.; Gao, J.; Yih, W.-t.; and Galley, M. 2017. A knowledge-grounded neural conversation model. arXiv preprint arXiv:1702.01932.

[Hochreiter and Schmidhuber 1997] Hochreiter, S., and Schmidhuber, J. 1997. Long short-term memory. Neural Computation 9(8):1735–1780.

[Kadlec, Schmid, and Kleindienst 2015] Kadlec, R.; Schmid, M.; and Kleindienst, J. 2015. Improved deep learning baselines for Ubuntu corpus dialogs. arXiv preprint arXiv:1510.03753.

[Kingma and Ba 2014] Kingma, D. P., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Krizhevsky, Sutskever, and Hinton 2012] Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 1097–1105.

[Kummerfeld et al. 2018] Kummerfeld, J. K.; Gouravajhala, S. R.; Peper, J.; Athreya, V.; Gunasekara, C.; Ganhotra, J.; Patel, S. S.; Polymenakos, L.; and Lasecki, W. S. 2018. Analyzing assumptions in conversation disentanglement research through the lens of a new dataset and model. arXiv preprint arXiv:1810.11118.

[Le, Dymetman, and Renders 2016] Le, P.; Dymetman, M.; and Renders, J.-M. 2016. LSTM-based mixture-of-experts for knowledge-aware dialogues. arXiv preprint arXiv:1605.01652.

[Li et al. 2016] Li, J.; Galley, M.; Brockett, C.; Gao, J.; and Dolan, B. 2016. A diversity-promoting objective function for neural conversation models. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 110–119.

[Lowe et al. 2015a] Lowe, R.; Pow, N.; Serban, I.; Charlin, L.; and Pineau, J. 2015a. Incorporating unstructured textual knowl...

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems.

Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

[Pandey et al. 2018] Pandey, G.; Contractor, D.; Kumar, V.; and Joshi, S. 2018. Exemplar encoder-decoder for neural conversation generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, 1329–1338.

[Pennington, Socher, and Manning 2014] Pennington, J.; Socher, R.; and Manning, C. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543.

[Seo et al. 2016] Seo, M.; Min, S.; Farhadi, A.; and Hajishirzi, H. 2016. Query-reduction networks for question answering. arXiv preprint arXiv:1606.04582.

[Serban et al. 2016] Serban, I. V.; Sordoni, A.; Bengio, Y.; Courville, A. C.; and Pineau, J. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In AAAI, volume 16, 3776–3784.

[Serban et al. 2017] Serban, I. V.; Sordoni, A.; Lowe, R.; Charlin, L.; Pineau, J.; Courville, A. C.; and Bengio, Y. 2017. A hierarchical latent variable encoder-decoder model for generating dialogues. In AAAI, 3295–3301.

[Sordoni et al. 2015] Sordoni, A.; Galley, M.; Auli, M.; Brockett, C.; Ji, Y.; Mitchell, M.; Nie, J.-Y.; Gao, J.; and Dolan, B. 2015. A neural network approach to context-sensitive generation of conversational responses. arXiv preprint arXiv:1506.06714.

[Vinyals and Le 2015] Vinyals, O., and Le, Q. 2015. A neural conversational model. arXiv preprint arXiv:1506.05869.

[Wu et al. 2016] Wu, Y.; Wu, W.; Xing, C.; Zhou, M.; and Li, Z. 2016. Sequential matching network: A new architecture for multi-turn response selection in retrieval-based chatbots. arXiv preprint arXiv:1612.01627.

[Young et al. 2017] Young, T.; Cambria, E.; Chaturvedi, I.; Huang, M.; Zhou, H.; and Biswas, S. 2017. Augmenting end-to-end dialog systems with commonsense knowledge. arXiv preprint arXiv:1709.05453.
