MINT: Evaluating LLMs in multi-turn interaction with tools and language feedback

Xingyao Wang, Zihan Wang, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng, Heng Ji · 2024

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

citation-role summary

background 1

background 1

cs.CL · 2025-05-21 · unverdicted · novelty 7.0

MTR-Bench is a new automated benchmark for multi-turn reasoning in LLMs covering diverse tasks and difficulty levels with 3600 instances.

Showing 1 of 1 citing paper.

MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation cs.CL · 2025-05-21 · unverdicted · none · ref 37
MTR-Bench is a new automated benchmark for multi-turn reasoning in LLMs covering diverse tasks and difficulty levels with 3600 instances.