hub

Hierarchical neural story generation.CoRR, abs/1805.04833

Angela Fan, Mike Lewis, Yann Dauphin · 2018 · cs.CL · arXiv 1805.04833

22 Pith papers cite this work. Polarity classification is still indexing.

22 Pith papers citing it

open full Pith review browse 22 citing papers arXiv PDF

abstract

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 1 baseline 1 dataset 1 other 1

citation-polarity summary

background 1 baseline 1 unclear 1 use dataset 1

representative citing papers

Improving Sampling for Masked Diffusion Models via Information Gain

cs.CL · 2026-02-20 · unverdicted · novelty 7.0

Info-Gain Sampler improves MDM decoding by using bidirectional information gain to reduce cumulative uncertainty, outperforming greedy samplers on reasoning accuracy and creative writing tasks.

Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models

cs.CL · 2026-02-10 · unverdicted · novelty 7.0

Top-W applies Wasserstein-regularized truncation on token-embedding geometry to create a closed-form optimal crop for LLM sampling that outperforms prior methods by up to 33.7% on GSM8K, GPQA, AlpacaEval, and MT-Bench.

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

cs.CL · 2025-09-02 · unverdicted · novelty 7.0

Top-H decoding is a computationally efficient greedy algorithm for an entropy-constrained mass maximization problem that improves the creativity-coherence trade-off over min-p sampling in LLM text generation.

Constrained Decoding for Safe Robot Navigation Foundation Models

cs.RO · 2025-09-01 · unverdicted · novelty 7.0

SafeDec uses constrained decoding to ensure autoregressive robot navigation foundation models generate actions that provably satisfy STL safety specifications under assumed dynamics.

TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

cs.CL · 2025-04-29 · unverdicted · novelty 7.0

The authors generate and publicly release the first large-scale open dataset of three million structured moral fables produced by small open language models together with a reproducible LLM-judge evaluation pipeline.

Chronos: Learning the Language of Time Series

cs.LG · 2024-03-12 · conditional · novelty 7.0

Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.

The Impact of Off-Policy Training Data on Probe Generalisation

cs.AI · 2025-11-21 · unverdicted · novelty 6.0

Off-policy training data for LLM behavior probes causes significant generalization failures especially for intent-based behaviors like deception, and performance on coerced incentivised data correlates with real on-policy success.

Culinary Crossroads: A RAG Framework for Enhancing Diversity in Cross-Cultural Recipe Adaptation

cs.CL · 2025-07-29 · unverdicted · novelty 6.0

CARRIAGE is a RAG framework that improves output diversity in cross-cultural recipe adaptation by enhancing retrieval and context handling, reaching Pareto efficiency on diversity and quality versus closed-book LLMs.

CTRL: A Conditional Transformer Language Model for Controllable Generation

cs.CL · 2019-09-11 · unverdicted · novelty 6.0

CTRL is a large conditional transformer language model that uses naturally occurring control codes to steer text generation style and content.

Annotations Mitigate Post-Training Mode Collapse

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning while preserving instruction-following and improving with scale.

A Universal Avoidance Method for Diverse Multi-branch Generation

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

UAG is a universal avoidance generation method that increases multi-branch diversity in diffusion and transformer models by penalizing output similarity, delivering up to 1.9x higher diversity with 4.4x speed and 1/64th the FLOPs of prior methods.

Narrix: Remixing Narrative Strategies from Examples for Story Writing

cs.HC · 2026-04-08 · unverdicted · novelty 6.0

Narrix helps novices identify and reuse narrative strategies from examples through visualization and strategy-steered generation, improving retention, confidence, and adaptation over chat interfaces in a 12-person study.

Do Linear Probes Generalize Better in Persona Coordinates?

cs.AI · 2026-05-10 · unverdicted · novelty 5.0 · 2 refs

Persona axes derived from contrastive prompts and PCA yield linear probes that generalize better than raw-activation probes across 10 datasets for deception and sycophancy.

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

cs.LG · 2023-04-13 · unverdicted · novelty 5.0

RAFT aligns generative models by ranking samples with a reward model and fine-tuning only on the top-ranked outputs, reporting gains on reward scores and automated metrics for LLMs and diffusion models.

Visual Interaction with Deep Learning Models through Collaborative Semantic Inference

cs.HC · 2019-07-24 · unverdicted · novelty 5.0

Proposes the CSI framework for co-designing visual interactions and deep learning models to expose and allow semantic control over intermediate reasoning processes, shown in a summarization case study.

APCD: Adaptive Path-Contrastive Decoding for Reliable Large Language Model Generation

cs.CL · 2026-05-10 · unverdicted · novelty 5.0

APCD adaptively branches LLM decoding paths based on token entropy and contrasts divergent paths to improve factual accuracy while preserving efficiency.

Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition

cs.SE · 2026-05-07 · unverdicted · novelty 5.0

An AST pattern-matching prototype with a custom DSL achieves 0.74 average F1-score on a BigCloneEval subset, outperforming CodeLlama (0.35) and code clone detectors (best recall 0.20).

WriterForcing: Generating more interesting story endings

cs.LG · 2019-07-18 · unverdicted · novelty 4.0

WriterForcing combines keyphrase attention and non-generic word promotion in Seq2Seq models to produce more diverse and interesting story endings.

Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition

cs.SE · 2026-04-03 · conditional · novelty 4.0

Hybrid LLM plus static analysis for algorithm recognition in code cuts required model calls by 72-97% and lifts F1-scores by as much as 12 points.

Skeleton-based Coherence Modeling in Narratives

cs.CL · 2026-04-02 · unverdicted · novelty 4.0

Sentence-level models outperform skeleton-based approaches for narrative coherence despite a new SSN network improving on cosine and Euclidean baselines.

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

cs.CL · 2024-12-07 · accept · novelty 3.0

A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework

cs.CL · 2026-05-22

citing papers explorer

Showing 22 of 22 citing papers.

Improving Sampling for Masked Diffusion Models via Information Gain cs.CL · 2026-02-20 · unverdicted · none · ref 6 · internal anchor
Info-Gain Sampler improves MDM decoding by using bidirectional information gain to reduce cumulative uncertainty, outperforming greedy samplers on reasoning accuracy and creative writing tasks.
Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models cs.CL · 2026-02-10 · unverdicted · none · ref 9 · internal anchor
Top-W applies Wasserstein-regularized truncation on token-embedding geometry to create a closed-form optimal crop for LLM sampling that outperforms prior methods by up to 33.7% on GSM8K, GPQA, AlpacaEval, and MT-Bench.
Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation cs.CL · 2025-09-02 · unverdicted · none · ref 6 · internal anchor
Top-H decoding is a computationally efficient greedy algorithm for an entropy-constrained mass maximization problem that improves the creativity-coherence trade-off over min-p sampling in LLM text generation.
Constrained Decoding for Safe Robot Navigation Foundation Models cs.RO · 2025-09-01 · unverdicted · none · ref 17 · internal anchor
SafeDec uses constrained decoding to ensure autoregressive robot navigation foundation models generate actions that provably satisfy STL safety specifications under assumed dynamics.
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models cs.CL · 2025-04-29 · unverdicted · none · ref 15 · internal anchor
The authors generate and publicly release the first large-scale open dataset of three million structured moral fables produced by small open language models together with a reproducible LLM-judge evaluation pipeline.
Chronos: Learning the Language of Time Series cs.LG · 2024-03-12 · conditional · none · ref 23
Chronos pretrains transformer models on tokenized time series to deliver strong zero-shot forecasting across diverse domains.
The Impact of Off-Policy Training Data on Probe Generalisation cs.AI · 2025-11-21 · unverdicted · none · ref 11 · internal anchor
Off-policy training data for LLM behavior probes causes significant generalization failures especially for intent-based behaviors like deception, and performance on coerced incentivised data correlates with real on-policy success.
Culinary Crossroads: A RAG Framework for Enhancing Diversity in Cross-Cultural Recipe Adaptation cs.CL · 2025-07-29 · unverdicted · none · ref 3 · internal anchor
CARRIAGE is a RAG framework that improves output diversity in cross-cultural recipe adaptation by enhancing retrieval and context handling, reaching Pareto efficiency on diversity and quality versus closed-book LLMs.
CTRL: A Conditional Transformer Language Model for Controllable Generation cs.CL · 2019-09-11 · unverdicted · none · ref 11 · internal anchor
CTRL is a large conditional transformer language model that uses naturally occurring control codes to steer text generation style and content.
Annotations Mitigate Post-Training Mode Collapse cs.CL · 2026-05-11 · unverdicted · none · ref 31
Annotation-anchored training reduces semantic diversity collapse in post-trained language models by a factor of six compared to standard supervised fine-tuning while preserving instruction-following and improving with scale.
A Universal Avoidance Method for Diverse Multi-branch Generation cs.CL · 2026-04-19 · unverdicted · none · ref 37
UAG is a universal avoidance generation method that increases multi-branch diversity in diffusion and transformer models by penalizing output similarity, delivering up to 1.9x higher diversity with 4.4x speed and 1/64th the FLOPs of prior methods.
Narrix: Remixing Narrative Strategies from Examples for Story Writing cs.HC · 2026-04-08 · unverdicted · none · ref 25
Narrix helps novices identify and reuse narrative strategies from examples through visualization and strategy-steered generation, improving retention, confidence, and adaptation over chat interfaces in a 12-person study.
Do Linear Probes Generalize Better in Persona Coordinates? cs.AI · 2026-05-10 · unverdicted · none · ref 9 · 2 links · internal anchor
Persona axes derived from contrastive prompts and PCA yield linear probes that generalize better than raw-activation probes across 10 datasets for deception and sycophancy.
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment cs.LG · 2023-04-13 · unverdicted · none · ref 142 · internal anchor
RAFT aligns generative models by ranking samples with a reward model and fine-tuning only on the top-ranked outputs, reporting gains on reward scores and automated metrics for LLMs and diffusion models.
Visual Interaction with Deep Learning Models through Collaborative Semantic Inference cs.HC · 2019-07-24 · unverdicted · none · ref 26 · internal anchor
Proposes the CSI framework for co-designing visual interactions and deep learning models to expose and allow semantic control over intermediate reasoning processes, shown in a summarization case study.
APCD: Adaptive Path-Contrastive Decoding for Reliable Large Language Model Generation cs.CL · 2026-05-10 · unverdicted · none · ref 58
APCD adaptively branches LLM decoding paths based on token entropy and contrasts divergent paths to improve factual accuracy while preserving efficiency.
Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition cs.SE · 2026-05-07 · unverdicted · none · ref 45
An AST pattern-matching prototype with a custom DSL achieves 0.74 average F1-score on a BigCloneEval subset, outperforming CodeLlama (0.35) and code clone detectors (best recall 0.20).
WriterForcing: Generating more interesting story endings cs.LG · 2019-07-18 · unverdicted · none · ref 7 · internal anchor
WriterForcing combines keyphrase attention and non-generic word promotion in Seq2Seq models to produce more diverse and interesting story endings.
Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition cs.SE · 2026-04-03 · conditional · none · ref 74
Hybrid LLM plus static analysis for algorithm recognition in code cuts required model calls by 72-97% and lifts F1-scores by as much as 12 points.
Skeleton-based Coherence Modeling in Narratives cs.CL · 2026-04-02 · unverdicted · none · ref 6
Sentence-level models outperform skeleton-based approaches for narrative coherence despite a new SSN network improving on cosine and Euclidean baselines.
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods cs.CL · 2024-12-07 · accept · none · ref 60
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.
How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework cs.CL · 2026-05-22 · unreviewed · ref 102 · internal anchor

Hierarchical neural story generation.CoRR, abs/1805.04833

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer