Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies
- 2026-05-15 | UNVERDICTED | LOW | v0.9.0 | novelty 8.0 | 45842 ms | 5606 in | 1278 out | 2026-05-15T07:04:25.127126+00:00
- 2026-05-13 | CONDITIONAL | LOW | v0.9.0 | novelty 7.0 | 30685 ms | 5608 in | 973 out | 2026-05-13T07:34:41.870601+00:00
- 2026-05-12 | UNVERDICTED | LOW | v0.9.0 | novelty 8.0 | 121634 ms | 5608 in | 1303 out | 2026-05-12T04:02:24.053344+00:00
- 2026-05-07 | UNVERDICTED | LOW | v0.9.0 | novelty 8.0 | 74078 ms | 5609 in | 1393 out | 2026-05-07T16:32:24.172683+00:00
- 2026-05-07 | UNVERDICTED | LOW | v0.9.0 | novelty 7.0 | 27457 ms | 5587 in | 1103 out | 2026-05-07T01:28:24.242567+00:00