MT-Bench-101: A fine-grained benchmark for evaluating large language models in multi-turn dialogues

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset: 2 · background: 1

years

2026: 8 · 2025: 4

representative citing papers

TRINITY: An Evolved LLM Coordinator

cs.LG · 2025-12-04 · unverdicted · novelty 6.0

A compact 0.6B-parameter coordinator with a 10K-parameter head uses evolutionary strategy to dynamically delegate roles to LLMs, achieving SOTA results such as 86.2% on LiveCodeBench.
