pith. machine review for the scientific record.

arxiv: 1805.04833 · v1 · submitted 2018-05-13 · 💻 cs.CL

Recognition: unknown

Hierarchical Neural Story Generation

keywords model, story, generation, dataset, hierarchical, human, improvements, large

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.

This paper has not been read by Pith yet.

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Chronos: Learning the Language of Time Series

    cs.LG 2024-03 conditional novelty 7.0

    Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.

  2. Annotations Mitigate Post-Training Mode Collapse

    cs.CL 2026-05 unverdicted novelty 6.0

    Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning while preserving instruction-following and improving with scale.

  3. APCD: Adaptive Path-Contrastive Decoding for Reliable Large Language Model Generation

    cs.CL 2026-05 unverdicted novelty 6.0

    APCD reduces LLM hallucinations by expanding decoding paths adaptively when entropy signals uncertainty and by contrasting divergent paths to control their interaction.

  4. A Universal Avoidance Method for Diverse Multi-branch Generation

    cs.CL 2026-04 unverdicted novelty 6.0

    UAG is a universal avoidance generation method that increases multi-branch diversity in diffusion and transformer models by penalizing output similarity, delivering up to 1.9x higher diversity with 4.4x speed and 1/64...

  5. Narrix: Remixing Narrative Strategies from Examples for Story Writing

    cs.HC 2026-04 unverdicted novelty 6.0

    Narrix helps novices identify and reuse narrative strategies from examples through visualization and strategy-steered generation, improving retention, confidence, and adaptation over chat interfaces in a 12-person study.

  6. Do Linear Probes Generalize Better in Persona Coordinates?

    cs.AI 2026-05 unverdicted novelty 5.0

    Probes on persona principal components from contrastive prompts generalize better than raw activation probes for harmful behaviors across 10 datasets.

  7. Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition

    cs.SE 2026-05 unverdicted novelty 5.0

    An AST pattern-matching prototype with a custom DSL achieves 0.74 average F1-score on a BigCloneEval subset, outperforming CodeLlama (0.35) and code clone detectors (best recall 0.20).

  8. Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition

    cs.SE 2026-04 conditional novelty 4.0

    Hybrid LLM plus static analysis for algorithm recognition in code cuts required model calls by 72-97% and lifts F1-scores by as much as 12 points.

  9. Skeleton-based Coherence Modeling in Narratives

    cs.CL 2026-04 unverdicted novelty 4.0

    Sentence-level models outperform skeleton-based approaches for narrative coherence despite a new SSN network improving on cosine and Euclidean baselines.

  10. LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

    cs.CL 2024-12 accept novelty 3.0

    A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.