Revisiting the evaluation of theory of mind through question answering

Matthew Le, Y-Lan Boureau, Maximilian Nickel · 2019

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

ProactBench: Beyond What The User Asked For

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

ProactBench: Beyond What The User Asked For · cs.LG · 2026-05-09 · unverdicted · none · ref 121
