Hierarchical Neural Story Generation
We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and by adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.
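The abstract names a gated multi-scale self-attention mechanism without giving details. Below is a minimal sketch of one plausible reading, assuming PyTorch: each scale is a causal attention pass restricted to a different window of past tokens, and a GLU-style gate fuses the scales. The window sizes, module name, and gating placement are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMultiScaleSelfAttention(nn.Module):
    """Illustrative sketch: one causal attention pass per window size,
    outputs fused by a GLU-style gate. Hyperparameters are assumptions."""

    def __init__(self, dim, windows=(8, 32, 128)):
        super().__init__()
        self.windows = windows
        self.qkv = nn.ModuleList([nn.Linear(dim, 3 * dim) for _ in windows])
        # GLU gate: half of the projection multiplicatively gates the other half.
        self.gate = nn.Linear(dim * len(windows), 2 * dim)

    def forward(self, x):                               # x: (batch, seq, dim)
        b, t, d = x.shape
        pos = torch.arange(t, device=x.device)
        dist = pos[:, None] - pos[None, :]               # i - j for query i, key j
        outs = []
        for win, qkv in zip(self.windows, self.qkv):
            q, k, v = qkv(x).chunk(3, dim=-1)
            scores = q @ k.transpose(1, 2) / d ** 0.5    # (batch, seq, seq)
            # Causal mask limited to `win` past positions (different per scale).
            mask = (dist < 0) | (dist >= win)
            scores = scores.masked_fill(mask, float('-inf'))
            outs.append(F.softmax(scores, dim=-1) @ v)
        a, g = self.gate(torch.cat(outs, dim=-1)).chunk(2, dim=-1)
        return a * torch.sigmoid(g)                      # gated combination

x = torch.randn(2, 50, 64)
print(GatedMultiScaleSelfAttention(64)(x).shape)         # torch.Size([2, 50, 64])
```

The gate lets each position weight local against long-range context, which is the role the abstract assigns to the mechanism.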
This paper has not been read by Pith yet.
Forward citations
Cited by 10 Pith papers
- Chronos: Learning the Language of Time Series
Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.
- Annotations Mitigate Post-Training Mode Collapse
Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning while preserving instruction-following and improving with scale.
- APCD: Adaptive Path-Contrastive Decoding for Reliable Large Language Model Generation
APCD reduces LLM hallucinations by expanding decoding paths adaptively when entropy signals uncertainty and by contrasting divergent paths to control their interaction.
- A Universal Avoidance Method for Diverse Multi-branch Generation
UAG is a universal avoidance generation method that increases multi-branch diversity in diffusion and transformer models by penalizing output similarity, delivering up to 1.9x higher diversity with 4.4x speed and 1/64...
- Narrix: Remixing Narrative Strategies from Examples for Story Writing
Narrix helps novices identify and reuse narrative strategies from examples through visualization and strategy-steered generation, improving retention, confidence, and adaptation over chat interfaces in a 12-person study.
- Do Linear Probes Generalize Better in Persona Coordinates?
Probes on persona principal components from contrastive prompts generalize better than raw activation probes for harmful behaviors across 10 datasets.
- Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition
An AST pattern-matching prototype with a custom DSL achieves 0.74 average F1-score on a BigCloneEval subset, outperforming CodeLlama (0.35) and code clone detectors (best recall 0.20).
- Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition
Hybrid LLM plus static analysis for algorithm recognition in code cuts required model calls by 72-97% and lifts F1-scores by as much as 12 points.
- Skeleton-based Coherence Modeling in Narratives
Sentence-level models outperform skeleton-based approaches for narrative coherence despite a new SSN network improving on cosine and Euclidean baselines.
- LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.