Efficient Adaptation of Pretrained Transformers for Abstractive Summarization

Andrew Hoang; Antoine Bosselut; Asli Celikyilmaz; Yejin Choi

arxiv: 1906.00138 · v1 · pith:4MZXIVOLnew · submitted 2019-06-01 · 💻 cs.CL

classification 💻 cs.CL

keywords abstractive · improvements · language · summarization · datasets · less · models · performance

Large-scale learning of transformer language models has yielded improvements on a variety of natural language understanding tasks. Whether they can be effectively adapted for summarization, however, has been less explored, as the learned representations are less seamlessly integrated into existing neural text production architectures. In this work, we propose two solutions for efficiently adapting pretrained transformer language models as text summarizers: source embeddings and domain-adaptive training. We test these solutions on three abstractive summarization datasets, achieving new state-of-the-art performance on two of them. Finally, we show that these improvements are achieved by producing more focused summaries with less superfluous content, and that performance improvements are more pronounced on more abstractive datasets.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
cs.CL · 2023-11 · conditional · novelty 5.0

Continued pretraining of Llama-2 on medical data yields MEDITRON-70B, which outperforms GPT-3.5 and Med-PaLM while approaching GPT-4 performance on medical benchmarks.