WriterForcing: Generating more interesting story endings
Pith reviewed 2026-05-24 19:36 UTC · model grok-4.3
The pith
Seq2Seq models trained to focus on story keyphrases and non-generic words produce more diverse endings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Training models to focus attention on important keyphrases of the story and promoting generation of non-generic words leads to more diverse and interesting story endings.
What carries the argument
WriterForcing, a training procedure that combines keyphrase-guided attention with explicit promotion of non-generic words inside a sequence-to-sequence generator.
If this is right
- The two modifications together increase measured diversity of generated endings relative to unmodified seq2seq training.
- The same combination increases human ratings of ending interestingness.
- Keyphrase attention alone helps the model stay grounded in the supplied story context.
- Penalizing generic words alone reduces the tendency toward dull, high-probability continuations.
Where Pith is reading between the lines
- The same pair of training signals could be tested on other conditional generation tasks such as dialogue response or news continuation.
- If the gains hold, post-hoc diversity techniques such as nucleus sampling might become less necessary.
- The approach leaves open whether similar gains appear when the input is longer or drawn from different genres.
Load-bearing premise
Directing attention to keyphrases and discouraging generic words will improve human judgments of interest and diversity without harming coherence or overall quality.
What would settle it
A controlled human evaluation in which the modified models receive equal or lower scores than standard seq2seq models on interest, diversity, or coherence.
read the original abstract
We study the problem of generating interesting endings for stories. Neural generative models have shown promising results for various text generation problems. Sequence to Sequence (Seq2Seq) models are typically trained to generate a single output sequence for a given input sequence. However, in the context of a story, multiple endings are possible. Seq2Seq models tend to ignore the context and generate generic and dull responses. Very few works have studied generating diverse and interesting story endings for a given story context. In this paper, we propose models which generate more diverse and interesting outputs by 1) training models to focus attention on important keyphrases of the story, and 2) promoting generation of non-generic words. We show that the combination of the two leads to more diverse and interesting endings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes WriterForcing, a training modification for Seq2Seq models that combines (1) forcing attention on important keyphrases from the story context and (2) promoting generation of non-generic words, with the central claim that this combination yields more diverse and interesting story endings than standard training.
Significance. If the empirical results hold, the method offers a lightweight way to mitigate the generic-response problem that is well-documented in neural story generation; the two components are simple to implement and could be adopted as a baseline or ablation target in future work on creative text generation.
minor comments (1)
- [Abstract] Abstract: the central claim is stated without any mention of datasets, metrics (e.g., diversity scores, human evaluation criteria), baselines, or quantitative results, which prevents assessment of whether the evidence supports the claim.
Simulated Author's Rebuttal
We thank the referee for reviewing our manuscript. The provided summary accurately captures the core idea of WriterForcing as a combination of keyphrase attention and non-generic word promotion to improve story ending diversity. The recommendation is listed as uncertain, but the report contains no enumerated major comments following the 'MAJOR COMMENTS:' heading. We therefore have no specific points to rebut point-by-point and stand ready to address any additional feedback the referee may wish to supply.
- No specific major comments were supplied in the referee report, preventing any point-by-point response.
Circularity Check
No significant circularity; empirical claims rest on external evaluation
full rationale
The paper proposes two training modifications (keyphrase attention focus and non-generic word promotion) for story-ending generation and claims their combination yields more diverse/interesting outputs. This is presented as an empirical result evaluated on the target task, with no equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations visible in the provided text. The central claim does not reduce to its inputs by construction and is supported by human evaluation rather than internal redefinition.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Seq2Seq models can be trained with modified attention and loss functions to improve output diversity and interestingness.
Reference graph
Works this paper leans on
-
[1]
Neural Machine Translation by Jointly Learning to Align and Translate
Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. Ashutosh Baheti, Alan Ritter, Jiwei Li, and Bill Dolan
work page internal anchor Pith review Pith/arXiv arXiv
-
[2]
Generating more interesting responses in neural conversation models with distributional constraints. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Pro- cessing, pages 3970–3980, Brussels, Belgium. Asso- ciation for Computational Linguistics. J. S. Chen, Jiaao Chen, and Zhou Yu
work page 2018
-
[3]
Incorporating Structured Commonsense Knowledge in Story Completion
Incor- porating structured commonsense knowledge in story completion. CoRR, abs/1811.00625. Elizabeth Clark, Yangfeng Ji, and Noah A Smith
work page internal anchor Pith review Pith/arXiv arXiv
-
[4]
Neural text generation in stories using entity rep- resentations as context. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language T echnologies, Volume 1 (Long Papers) , pages 2250–2260. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina T outanova
work page 2018
-
[5]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Bert: Pre-training of deep bidirectional transformers for language un- derstanding. arXiv preprint arXiv:1810.04805. Carl Doersch
work page internal anchor Pith review Pith/arXiv arXiv
-
[6]
Tutorial on variational autoencoders,
Tutorial on variational autoen- coders. arXiv preprint arXiv:1606.05908. Angela Fan, Mike Lewis, and Yann Dauphin
-
[7]
Hierarchical Neural Story Generation
Hierarchical neural story generation. arXiv preprint arXiv:1805.04833. Jian Guan, Yansen Wang, and Minlie Huang
work page internal anchor Pith review Pith/arXiv arXiv
-
[9]
Story Ending Generation with Incremental Encoding and Commonsense Knowledge
Story ending generation with incremental en- coding and commonsense knowledge. CoRR, abs/1808.10113. Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov , and Eric P . Xing. 2017a. T oward controlled generation of text. In Proceedings of the 34th International Conference on Machine Learn- ing, volume 70 of Proceedings of Machine Learning Research...
work page internal anchor Pith review Pith/arXiv arXiv
-
[10]
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
Hi- erarchically structured reinforcement learning for topically coherent visual story generation. arXiv preprint arXiv:1805.08191. Parag Jain, Priyanka Agrawal, Abhijit Mishra, Mohak Sukhwani, Anirban Laha, and Karthik Sankara- narayanan
work page internal anchor Pith review Pith/arXiv arXiv
-
[11]
Story Generation from Sequence of Independent Short Descriptions
Story generation from sequence of independent short descriptions. arXiv preprint arXiv:1707.05501. Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016a. A diversity-promoting ob- jective function for neural conversation models. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computa- tiona...
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[12]
Adversar- ial learning for neural dialogue generation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing , pages 2157–2169, Copenhagen, Denmark. Association for Computational Linguistics. Zhongyang Li, Xiao Ding, and Ting Liu
work page 2017
-
[13]
Lsd- sem 2017 shared task: The story cloze test. In Pro- ceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics , pages 46–51. Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, and Satoshi Nakamura
work page 2017
-
[14]
Another Diversity-Promoting Objective Function for Neural Dialogue Generation
Another diversity- promoting objective function for neural dialogue generation. arXiv preprint arXiv:1811.08100. Nanyun Peng, Marjan Ghazvininejad, Jonathan May, and Kevin Knight
work page internal anchor Pith review Pith/arXiv arXiv
-
[15]
Gen- erating high-quality and informative conversation responses with sequence-to-sequence models. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing , pages 2210–2219, Copenhagen, Denmark. Association for Computational Linguistics. Ilya Sutskever, Oriol Vinyals, and Quoc V Le
work page 2017
-
[16]
Diverse beam search for improved description of com- plex scenes. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI- 18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Sympo- sium on Educational Advances in Artificial Intel- ligence (EAAI-18), New Orleans, Louisiana, USA, February ...
work page 2018
-
[17]
Topic Aware Neural Response Generation
T opic augmented neural response generation with a joint attention mechanism. arXiv preprint arXiv:1606.08340, 2(2). Jingjing Xu, Xuancheng Ren, Junyang Lin, and Xu Sun
work page internal anchor Pith review Pith/arXiv arXiv
-
[18]
Diversity-promoting gan: A cross- entropy based generative adversarial network for diversified text generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3940–3949. Lili Yao, Nanyun Peng, Weischedel Ralph, Kevin Knight, Dongyan Zhao, and Rui Yan
work page 2018
-
[19]
Plan-And-Write: Towards Better Automatic Storytelling
Plan- and-write: T owards better automatic storytelling. arXiv preprint arXiv:1811.05701. Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston
work page internal anchor Pith review Pith/arXiv arXiv
-
[20]
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Per- sonalizing dialogue agents: I have a dog, do you have pets too? arXiv preprint arXiv:1801.07243. Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, and Bing Liu
work page internal anchor Pith review Pith/arXiv arXiv
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.