pith. the verified trust layer for science.

Revisiting the evaluation of theory of mind through question answering

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

ProactBench: Beyond What The User Asked For

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

  • ProactBench: Beyond What The User Asked For · cs.LG · 2026-05-09 · unverdicted · none · ref 121
