Story Ending Prediction by Transferable BERT

Ting Liu; Xiao Ding; Zhongyang Li

arXiv: 1905.07504 · v2 · pith:LTYYWV7Hnew · submitted 2019-05-17 · cs.CL · cs.LG



keywords: bert, prediction, tasks, ending, knowledge, language, model, story


Recent advances such as GPT and BERT have shown that combining a pre-trained transformer language model with fine-tuning improves downstream NLP systems. However, this framework still has fundamental problems in effectively incorporating supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer to a target task not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks. In particular, we propose using three kinds of transfer tasks (natural language inference, sentiment classification, and next action prediction) to further train BERT starting from a pre-trained model. This gives the model a better initialization for the target task. We take story ending prediction as the target task for our experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baseline methods. Several comparative experiments offer suggestions on how to select transfer tasks. Error analysis shows the strengths and weaknesses of BERT-based models for story ending prediction.
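
The staged training order described in the abstract (pre-trained weights, then a supervised transfer task, then the target task) can be illustrated with a minimal toy sketch. This is not the authors' implementation: the tiny `tanh` encoder stands in for BERT, and the function names (`forward`, `fine_tune`, `choose_ending`), the feature vectors, and the hyperparameters are all hypothetical choices made for illustration only.

```python
import math
import random


def forward(encoder, head, x):
    """Toy model: a per-feature 'encoder' followed by a logistic head."""
    hidden = [math.tanh(w * xi) for w, xi in zip(encoder, x)]
    logit = sum(hw * hi for hw, hi in zip(head, hidden))
    return hidden, 1.0 / (1.0 + math.exp(-logit))


def fine_tune(encoder, data, rng, epochs=50, lr=0.5):
    """Train a copy of `encoder` plus a fresh task head with SGD on log loss."""
    encoder = list(encoder)  # copy, so the caller's weights are untouched
    head = [rng.uniform(-0.1, 0.1) for _ in encoder]
    for _ in range(epochs):
        for x, y in data:
            hidden, prob = forward(encoder, head, x)
            err = prob - y  # gradient of log loss w.r.t. the logit
            for i, xi in enumerate(x):
                grad_head = err * hidden[i]
                grad_enc = err * head[i] * (1.0 - hidden[i] ** 2) * xi
                head[i] -= lr * grad_head
                encoder[i] -= lr * grad_enc
    return encoder, head


def choose_ending(encoder, head, candidates):
    """Return the index of the candidate ending with the highest score."""
    scores = [forward(encoder, head, x)[1] for x in candidates]
    return scores.index(max(scores))


rng = random.Random(0)
pretrained = [1.0, 1.0, 1.0]  # hypothetical stand-in for pre-trained BERT weights

# Hypothetical feature vectors with binary labels (1 = plausible).
transfer_data = [([1.0, 0.5, -0.2], 1), ([-1.0, -0.4, 0.3], 0)]
target_data = [([0.8, 0.6, -0.1], 1), ([-0.7, -0.5, 0.2], 0)]

# Stage 2: further train on a supervised transfer task.
transfer_encoder, _ = fine_tune(pretrained, transfer_data, rng)
# Stage 3: fine-tune on the target task, starting from the transferred encoder.
target_encoder, target_head = fine_tune(transfer_encoder, target_data, rng)

# Score two candidate endings and pick the higher-scoring one.
candidates = [[0.8, 0.6, -0.1], [-0.7, -0.5, 0.2]]
choice = choose_ending(target_encoder, target_head, candidates)
```

The point of the sketch is the hand-off between stages: the task head is re-initialized for each task, while the encoder weights carry over, so the target task starts from an encoder already shaped by the transfer task rather than from the pre-trained weights alone.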


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Language Models are Few-Shot Learners
cs.CL 2020-05 accept novelty 8.0

GPT-3 shows that scaling an autoregressive language model to 175 billion parameters enables strong few-shot performance across diverse NLP tasks via in-context prompting without fine-tuning.

Patent Claim Generation by Fine-Tuning OpenAI GPT-2
cs.CL 2019-07 unverdicted novelty 5.0

Fine-tunes GPT-2 on patent claims, probes training steps, analyzes conditional and unconditional sampling outputs, proposes a new sampling method, and releases an email bot for exploration.