Underspecification in localization: Pitfalls in adapting language technologies across cultures
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
NarrativeWorldBench: A Frontier-Saturated Benchmark and a Latent World Model for Long-Horizon Co-Creative Audio Drama
NarrativeWorldBench evaluates 21 LLMs on nine narrative metrics across horizons to 200 episodes and introduces N-VSSM, a 256-dimensional variational state-space model that achieves plot-beat F1 >=0.84 with 4x lower compute and wins writer preference on consistency.