To model human linguistic prediction, make LLMs less superhuman

Byung-Doh Oh, Tal Linzen · 2025 · arXiv 2510.05141

2 Pith papers cite this work. Polarity classification is still indexing.



representative citing papers

Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

Why are language models less surprised than humans? Testing the Parse Multiplicity Mismatch Hypothesis

cs.CL · 2026-05-14 · conditional · novelty 6.0

Varying the number of simultaneous parses in RNNGs increases predicted garden-path effects but does not fully reconcile LM surprisal with human reading times.

citing papers explorer


Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · none · ref 23 · internal anchor

Why are language models less surprised than humans? Testing the Parse Multiplicity Mismatch Hypothesis

cs.CL · 2026-05-14 · conditional · none · ref 71 · internal anchor
