Cross-family LLM differences outweigh within-family scaling on orthographic constraint puzzles, with modest human difficulty calibration but systematic failures on atypical common words.
The experimental instances are drawn from the New York Times Spelling Bee for research and evaluation purposes consistent with fair use principles.
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Orthographic Constraint Satisfaction and Human Difficulty Alignment in Large Language Models