Implicit Discourse Relation Identification for Open-domain Dialogues

Jiaqi Wu; Kevin K. Bowden; Marilyn Walker; Mingyu Derek Ma; Wen Cui

arxiv: 1907.03975 · v1 · pith:VK7F2RIWnew · submitted 2019-07-09 · 💻 cs.CL

Mingyu Derek Ma, Kevin K. Bowden, Jiaqi Wu, Wen Cui, Marilyn Walker

Pith reviewed 2026-05-25 00:52 UTC · model grok-4.3

classification 💻 cs.CL

keywords implicit discourse relations · open-domain dialogues · discourse parsing · dialogue systems · automatic extraction · feature ablation · corpus creation

The pith

A pipeline automatically extracts implicit discourse relation pairs from open-domain dialogues and uses dialogue features to improve identification models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a new approach to identifying implicit discourse relations specifically for open-domain dialogues rather than formal written text. It does this by designing a pipeline that automatically extracts relation argument pairs and labels from sequences of dialogic turns to build a dedicated corpus. The work then incorporates unique dialogue features into existing models and uses ablation to demonstrate gains. A sympathetic reader would care because current systems struggle with the informal and topic-shifting nature of real conversations, and this could make dialogue agents more coherent.

Core claim

We designed a novel discourse relation identification pipeline specifically tuned for open-domain dialogue systems. We firstly propose a method to automatically extract the implicit discourse relation argument pairs and labels from a dataset of dialogic turns, resulting in a novel corpus of discourse relation pairs; the first of its kind to attempt to identify the discourse relations connecting the dialogic turns in open-domain discourse. Moreover, we have taken the first steps to leverage the dialogue features unique to our task to further improve the identification of such relations by performing feature ablation and incorporating dialogue features to enhance the state-of-the-art model.

What carries the argument

The novel discourse relation identification pipeline tuned for open-domain dialogue systems, which includes automatic extraction of argument pairs from dialogic turns.
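
The text above does not say how the argument pairs are extracted. As an illustration only, the sketch below uses a technique known from the wider literature rather than the authors' confirmed method: find adjacent turns where the second turn opens with an explicit connective, strip the connective, and keep the connective's sense as the label. The connective map, the `RelationPair` type, and both function names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical connective-to-relation map (an assumption, not from the paper);
# a realistic inventory would be much larger and would handle ambiguity.
CONNECTIVE_TO_RELATION = {
    "because": "Contingency",
    "but": "Comparison",
    "then": "Temporal",
    "also": "Expansion",
}


class RelationPair(NamedTuple):
    arg1: str
    arg2: str
    relation: str


def extract_pair(prev_turn: str, next_turn: str) -> Optional[RelationPair]:
    """Label a turn pair if next_turn opens with a known connective.

    The connective is removed from arg2 so the pair looks implicit.
    """
    match = re.match(r"\s*(\w+)[,\s]+(.*)", next_turn, flags=re.DOTALL)
    if not match:
        return None
    connective = match.group(1).lower()
    remainder = match.group(2).strip()
    relation = CONNECTIVE_TO_RELATION.get(connective)
    if relation is None or not remainder:
        return None
    return RelationPair(prev_turn.strip(), remainder, relation)


def extract_pairs(turns: List[str]) -> List[RelationPair]:
    """Scan adjacent turns and collect every pair that can be labeled."""
    pairs = []
    for prev_turn, next_turn in zip(turns, turns[1:]):
        pair = extract_pair(prev_turn, next_turn)
        if pair is not None:
            pairs.append(pair)
    return pairs


# Invented sample turns, used only to exercise the sketch.
SAMPLE_TURNS = [
    "I love hiking.",
    "But my knees hate it.",
    "Do you like music?",
    "Because, well, concerts are fun.",
]
SAMPLE_PAIRS = extract_pairs(SAMPLE_TURNS)
```

A heuristic like this mislabels turns whenever a connective is ambiguous or used as a filler, which is the kind of noise the load-bearing premise is concerned with.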

If this is right

The resulting corpus enables training models on dialogic rather than formal text data.
Incorporating dialogue features improves performance on identifying relations in conversations.
Feature ablation reveals which dialogue aspects contribute most to the task.
This addresses the unsuitability of news and journal corpora for dialogue nuances and topics.
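
Feature ablation, as mentioned in the points above, is commonly run as leave-one-group-out: evaluate with all feature groups, then re-evaluate with each group removed and record the drop. The sketch below shows that loop under stated assumptions. The `evaluate` callback stands in for training and scoring a model, and the feature group names and toy scores are invented for illustration, not taken from the paper.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def ablate(
    feature_groups: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Return, per group, the score lost when that group alone is removed."""
    all_groups = frozenset(feature_groups)
    full_score = evaluate(all_groups)
    return {
        group: full_score - evaluate(all_groups - {group})
        for group in sorted(all_groups)
    }


# Made-up additive contributions, used only to exercise the harness.
TOY_CONTRIBUTION = {
    "speaker_change": 0.125,
    "turn_position": 0.25,
    "word_embeddings": 0.0,
}


def toy_evaluate(groups: FrozenSet[str]) -> float:
    """Stand-in for 'train a model on these groups and return its score'."""
    return 0.5 + sum(TOY_CONTRIBUTION[g] for g in groups)


drops = ablate(TOY_CONTRIBUTION, toy_evaluate)
```

A larger drop suggests the removed group mattered more. With real models the drops are not additive, and they are only as trustworthy as the labels the model was scored against.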

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future dialogue systems could use this method to maintain better discourse coherence across turns.
The extraction technique might apply to other conversational datasets to create more resources.
Improved relation identification could help in tasks like response generation that require understanding connections between utterances.

Load-bearing premise

The automatic extraction method from dialogic turns produces sufficiently accurate implicit discourse relation labels and pairs without substantial noise or mislabeling that would invalidate downstream model improvements.

What would settle it

A manual review of a sample of extracted pairs showing low agreement with the assigned labels would indicate the method does not produce reliable training data.
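
One way to put a number on "low agreement" is Cohen's kappa between the pipeline's labels and a human annotator's labels on the same sampled pairs. The choice of kappa is an assumption here, since the text names no statistic, and the function name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Chance-corrected agreement between two labelings of the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both labeled independently at their own rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[lab] * counts_b[lab] for lab in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sides used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

A kappa near zero on a reasonably sized sample would mean the extracted labels agree with the human reviewer little more than chance would predict.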

Original abstract

Discourse relation identification has been an active area of research for many years, and the challenge of identifying implicit relations remains largely an unsolved task, especially in the context of an open-domain dialogue system. Previous work primarily relies on a corpora of formal text which is inherently non-dialogic, i.e., news and journals. This data however is not suitable to handle the nuances of informal dialogue nor is it capable of navigating the plethora of valid topics present in open-domain dialogue. In this paper, we designed a novel discourse relation identification pipeline specifically tuned for open-domain dialogue systems. We firstly propose a method to automatically extract the implicit discourse relation argument pairs and labels from a dataset of dialogic turns, resulting in a novel corpus of discourse relation pairs; the first of its kind to attempt to identify the discourse relations connecting the dialogic turns in open-domain discourse. Moreover, we have taken the first steps to leverage the dialogue features unique to our task to further improve the identification of such relations by performing feature ablation and incorporating dialogue features to enhance the state-of-the-art model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper creates the first corpus for implicit discourse relations in open-domain dialogue via automatic extraction, but without any validation of those labels the feature ablation results are hard to trust.

The paper's main contribution is a new corpus for implicit discourse relations in open-domain dialogues, built by automatically extracting argument pairs and labels from dialogic turns. They follow this with feature ablation that adds dialogue-specific signals to boost an existing model. This is new relative to the PDTB-style corpora that come from news and journals.

The authors correctly note that those sources do not match the informal, open-topic style of dialogue systems. The pipeline they outline for pulling relations from turns is a straightforward way to start filling that gap, and the ablation on dialogue features is a logical next step. The work does a decent job of framing why formal text corpora fall short for conversational AI. The claim that this is the first such dialogue corpus seems accurate based on the references they cite.

The main weakness is the lack of any check on the automatic extraction. The abstract describes the method but gives no precision, recall, or sample review against known gold labels. Without that, it's impossible to know if the corpus has too much noise for the ablation results to mean anything. If the labels are off, then adding dialogue features might just be fitting to the noise rather than real signal. The abstract also omits all quantitative results, so we can't see the actual effect sizes.

This paper is for researchers working on discourse parsing in dialogue or building better conversational agents. Someone looking for new datasets in this area could get value from the extraction approach, though they would need to verify the labels themselves. I would bring this to a reading group focused on dialogue NLP to discuss the corpus construction. I would not cite it yet because the validation is missing. It deserves a serious referee because the resource idea is worth developing, even if the current version has this clear gap in evidence.

Referee Report

2 major / 0 minor

Summary. The paper claims to introduce a novel pipeline for implicit discourse relation identification tailored to open-domain dialogues. It describes an automatic method to extract implicit relation argument pairs and labels from dialogic turns, yielding the first such corpus, followed by feature ablation and incorporation of dialogue-specific features to improve a state-of-the-art model.

Significance. If the extraction method yields a sufficiently clean corpus and the reported improvements hold under validation, the work would address a clear gap by moving beyond formal-text corpora (e.g., PDTB-style news) to informal dialogue, potentially benefiting dialogue systems. The emphasis on dialogue-unique features is a constructive direction. At present, however, the absence of any quantitative results, label validation, or error analysis prevents assessment of whether the claimed gains are substantive.

major comments (2)

[Abstract] Abstract: the description of the pipeline and feature addition supplies no quantitative results, validation of extracted labels, or error analysis; the central claim of improvement therefore cannot be assessed from available text.
[Extraction method (as described in Abstract)] Extraction pipeline: the automatic extraction method from dialogic turns produces the corpus on which all subsequent ablation and SOTA-enhancement claims rest, yet no precision/recall figures, inter-annotator agreement, or sample analysis against PDTB-style gold labels are reported, leaving open the possibility that observed gains are artifacts of label noise rather than genuine discourse signal.
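
The precision and recall figures requested in the second comment could be computed per relation label once a hand-labeled sample exists. The sketch below is a minimal version under that assumption. The function name and the label strings in the usage note are illustrative, and undefined ratios are reported as 0.0 by convention.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def per_label_precision_recall(
    gold: Sequence[str], predicted: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """Map each label to (precision, recall) of predicted against gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    gold_count = Counter(gold)
    pred_count = Counter(predicted)
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)
    result = {}
    for label in sorted(set(gold) | set(predicted)):
        # A label never predicted (or never in gold) gets 0.0, not an error.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / gold_count[label] if gold_count[label] else 0.0
        result[label] = (precision, recall)
    return result
```

For example, calling it with `gold` as the reviewer's labels and `predicted` as the pipeline's labels shows which relation types the extraction over-assigns (low precision) or misses (low recall).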

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments correctly identify that quantitative results and validation details are needed to support the claims. We address each point below and will make the corresponding revisions.

Referee: [Abstract] Abstract: the description of the pipeline and feature addition supplies no quantitative results, validation of extracted labels, or error analysis; the central claim of improvement therefore cannot be assessed from available text.

Authors: We agree that the abstract provides no quantitative results. The full manuscript reports experimental results on the extracted corpus, feature ablations, and improvements to the state-of-the-art model. We will revise the abstract to summarize key metrics including corpus size, baseline performance, and the gains from dialogue features. revision: yes

Referee: [Extraction method (as described in Abstract)] Extraction pipeline: the automatic extraction method from dialogic turns produces the corpus on which all subsequent ablation and SOTA-enhancement claims rest, yet no precision/recall figures, inter-annotator agreement, or sample analysis against PDTB-style gold labels are reported, leaving open the possibility that observed gains are artifacts of label noise rather than genuine discourse signal.

Authors: We acknowledge that the current manuscript does not report precision/recall, IAA, or error analysis for the extraction pipeline. Because no PDTB-style gold standard exists for open-domain dialogues, direct comparison is not possible. We will add a validation subsection with manual sampling, inter-annotator agreement on extracted pairs, and qualitative error analysis to demonstrate corpus quality and rule out label noise as the source of gains. revision: yes
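
The manual sampling promised in this response could draw the same number of pairs per relation label, so that rare relations are reviewed too. The sketch below is one possible way to do that, not a procedure stated in the rebuttal. It assumes each extracted pair is an `(arg1, arg2, label)` tuple, and the function name and default seed are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str, str]  # (arg1, arg2, label); assumed layout


def stratified_sample(
    pairs: Sequence[Pair], per_label: int, seed: int = 0
) -> List[Pair]:
    """Draw up to per_label pairs for each label, reproducibly."""
    by_label: Dict[str, List[Pair]] = defaultdict(list)
    for pair in pairs:
        by_label[pair[2]].append(pair)
    rng = random.Random(seed)  # fixed seed so annotators see the same sample
    sample: List[Pair] = []
    for label in sorted(by_label):
        group = by_label[label]
        sample.extend(rng.sample(group, min(per_label, len(group))))
    return sample
```

Sampling per label rather than uniformly keeps a dominant relation from crowding rarer ones out of the review set. The trade-off is that corpus-level accuracy then has to be reweighted by label frequency.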

Circularity Check

0 steps flagged

No circularity; derivation relies on external data and ablation testing

full rationale

The paper constructs a new corpus by automatic extraction from an external dialogue dataset and evaluates improvements via feature ablation on an enhanced SOTA model. No equations, fitted parameters renamed as predictions, self-definitional steps, or load-bearing self-citations appear in the provided text. The central claims rest on empirical ablation results against external benchmarks rather than reducing to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no information on free parameters, axioms, or invented entities, and it does not describe the extraction method or model details.

