
Efficient selectivity and backup operators in monte-carlo tree search

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

citation-role summary

baseline 1

citation-polarity summary

fields

cs.AI 1

years

2025 1

verdicts

UNVERDICTED 1

roles

baseline 1

polarities

baseline 1

representative citing papers

VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments

cs.AI · 2025-06-03 · unverdicted · novelty 6.0

VS-Bench is a new benchmark of ten visual multi-agent environments that measures VLMs on element recognition, next-action prediction, and normalized episode return, showing strong perception but large gaps in reasoning and decision-making with the best model at 46.6% prediction accuracy and 31.4% of

citing papers explorer

Showing 1 of 1 citing paper.

  • VS-Bench: Evaluating VLMs for Strategic Abilities in Multi-Agent Environments cs.AI · 2025-06-03 · unverdicted · none · ref 16
