Review history

arxiv: 2604.08362 · 2 revisions

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

2026-05-22 UNVERDICTED LOW v0.9.0 novelty 7.0

66873 ms 5785 in 1357 out 2026-05-22T10:26:50.614032+00:00

2026-05-10 UNVERDICTED LOW v0.9.0 novelty 7.0

82018 ms 5555 in 1219 out 2026-05-10T18:16:31.449091+00:00