arxiv: 2510.05141 · v2 · pith:4ENE6T4Bnew · submitted 2025-10-01 · 💻 cs.CL

To model human linguistic prediction, make LLMs less superhuman

keywords llms · human · make · memory · predictions · upcoming · words · ability

When we read, we make predictions about upcoming words; these predictions influence our reading behavior. The success of large language models (LLMs), which, like humans, make predictions about upcoming words, has motivated their use as models of human linguistic prediction. Surprisingly, in the last few years, as LLMs' ability to predict the next word has improved, their ability to explain reading behavior has declined. We argue this is because current LLMs can predict upcoming words much better than human readers can. This 'superhumanness' is driven by LLMs' extensive training data, stronger long-term memory of training examples, and stronger short-term memory. We advocate for LLMs with human-like memory and for new experiments to measure the alignment between humans and LLMs, and outline directions towards achieving these goals.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Reinforcing Human Behavior Simulation via Verbal Feedback

    cs.LG 2026-05 unverdicted novelty 6.0

    DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

  2. Why are language models less surprised than humans? Testing the Parse Multiplicity Mismatch Hypothesis

    cs.CL 2026-05 conditional novelty 6.0

    Varying the number of simultaneous parses in RNNGs increases predicted garden-path effects but does not fully reconcile LM surprisal with human reading times.