A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents

Arman Cohan; Doo Soon Kim; Franck Dernoncourt; Nazli Goharian; Seokhwan Kim; Trung Bui; Walter Chang

arxiv: 1804.05685 · v2 · pith:U4XAZVBHnew · submitted 2018-04-16 · 💻 cs.CL


classification 💻 cs.CL

keywords abstractive · documents · model · models · summarization · discourse-aware · results · approach

Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary. Empirical results on two large-scale datasets of scientific papers show that our model significantly outperforms state-of-the-art models.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation
cs.LG · 2026-05 · unverdicted · novelty 7.0
Uniform diffusion models rely on a leave-one-out denoiser rather than the usual denoising posterior, with exact conversions derived; an absorbing-state reformulation is introduced that matches or exceeds masked diffus...

Dynamic Chunking for Diffusion Language Models
cs.CL · 2026-05 · unverdicted · novelty 7.0
DCDM replaces positional blocks with learnable semantic chunks via differentiable Chunking Attention, yielding consistent gains over block and unstructured diffusion baselines up to 1.5B parameters.

A Queueing-Theoretic Framework for Stability Analysis of LLM Inference with KV Cache Memory Constraints
cs.LG · 2026-05 · unverdicted · novelty 6.0
A queueing model derives stability conditions for LLM inference services under combined compute and KV cache memory limits, with experimental validation showing typical deviations under 10%.

From Tokens to Layers: Redefining Stall-Free Scheduling for MoE Serving with Layered Prefill
cs.LG · 2025-10 · unverdicted · novelty 6.0
Layered prefill replaces token-chunked prefill with layer-group interleaving in MoE models, cutting TTFT by up to 70%, end-to-end latency by 41%, and per-token energy by 22% while preserving stall-free TBT.

Enriching and Controlling Global Semantics for Text Summarization
cs.CL · 2021-09 · unverdicted · novelty 5.0
A normalizing-flow neural topic model plus control mechanism are added to Transformer summarizers to supply and regulate global semantics, with reported gains over prior models on five benchmarks.