Review history
Likelihood scoring for continuations of mathematical text: a self-supervised benchmark with tests for shortcut vulnerabilities
- 2026-05-19 UNVERDICTED
- 2026-05-12 UNVERDICTED