On one hand, we prompt larger LLMs to evaluate two stories generated by smaller LLMs; on the other hand, we compare a story from a larger LLM with a story from a smaller LLM.

vs QwQ-32B (Qwen et al · 2025

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

StoryAlign: Evaluating and Training Reward Models for Story Generation

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

StoryReward, trained on a new 100k story preference dataset, sets state-of-the-art performance on the introduced StoryRMB benchmark for aligning LLM stories with human preferences.

citing papers explorer

Showing 1 of 1 citing paper.

StoryAlign: Evaluating and Training Reward Models for Story Generation
cs.CL · 2026-05-06 · unverdicted · none · ref 36
StoryReward, trained on a new 100k story preference dataset, sets state-of-the-art performance on the introduced StoryRMB benchmark for aligning LLM stories with human preferences.
