Review history
arXiv:2510.10930 · 2 revisions
Evaluating Language Models' Evaluations of Games
2026-05-21 · UNVERDICTED · LOW · v0.9.0 · novelty 6.0 · 68635 ms · 5832 in · 1295 out · 2026-05-21T20:52:03.862431+00:00
2026-05-18 · UNVERDICTED · LOW · v0.9.0 · novelty 7.0 · 34579 ms · 5832 in · 1202 out · 2026-05-18T08:26:06.648592+00:00