arxiv: 1602.06023 · v5 · pith:JA3OAAC3new · submitted 2016-02-19 · 💻 cs.CL

Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond

Ramesh Nallapati , Bowen Zhou , Cicero Nogueira dos Santos , Caglar Gulcehre , Bing Xiang This is my paper

classification 💻 cs.CL

keywords performancesummarizationabstractivefurthermodelsproposetextwork

0 comments

read the original abstract

In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora. We propose several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time. Our work shows that many of our proposed models contribute to further improvement in performance. We also propose a new dataset consisting of multi-sentence summaries, and establish performance benchmarks for further research.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice
cs.LG 2026-05 unverdicted novelty 7.0

An identification theorem shows that a randomized experiment and simulator together recover causal model values from confounded logs, with logs used only afterward to reduce estimation error.
Whose Story Gets Told? Positionality and Bias in LLM Summaries of Life Narratives
cs.CL 2026-04 unverdicted novelty 6.0

A proposed pipeline shows LLMs introduce detectable race and gender biases when summarizing life narratives, creating potential for representational harm in research.
Reasoning Structure Matters for Safety Alignment of Reasoning Models
cs.AI 2026-04 unverdicted novelty 6.0

Changing the internal reasoning structure of large reasoning models through simple supervised fine-tuning on 1K examples produces strong safety alignment that generalizes across tasks and languages.
Learning to Control Summaries with Score Ranking
cs.CL 2026-04 unverdicted novelty 6.0

A score-ranking loss enables controllable summarization by aligning outputs to evaluation scores, matching SOTA performance with dimension-specific control on LLaMA, Qwen, and Mistral.
When LLMs get significantly worse: A statistical approach to detect model degradations
stat.ML 2026-02 conditional novelty 6.0

A McNemar-based statistical test detects real degradations in optimized LLMs with controlled false positives, even for accuracy changes as small as 0.3%.
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
cs.LG 2023-06 unverdicted novelty 6.0

H2O evicts non-heavy-hitter tokens from the KV cache using a dynamic submodular policy, retaining recent and frequent-co-occurrence tokens to reduce memory while preserving accuracy.
CTRL: A Conditional Transformer Language Model for Controllable Generation
cs.CL 2019-09 unverdicted novelty 6.0

CTRL is a large conditional transformer language model that uses naturally occurring control codes to steer text generation style and content.
Large Language Models: A Survey
cs.CL 2024-02 accept novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.