Hurt me plenty

Read dialogue, continue by pressing ‘A’

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

VideoGameBench: Can Vision-Language Models complete popular video games?

cs.AI · 2025-05-23 · unverdicted · novelty 7.0

Frontier vision-language models complete only 0.48% of VideoGameBench and 1.6% of its paused Lite version, a new real-time benchmark of 10 1990s games with raw visuals and minimal scaffolding.

citing papers explorer

Showing 1 of 1 citing paper.

VideoGameBench: Can Vision-Language Models complete popular video games?

cs.AI · 2025-05-23 · unverdicted · none · ref 32
