Introduces the TCR framework to evaluate educational LLM assistants on transparency, consistency, and refinement in multi-turn interactions, complementing aggregate metrics.
Advances in Neural Information Processing Systems , volume=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
dataset 1
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
dataset 1polarities
use dataset 1representative citing papers
Develops a margin-adaptive learned confidence estimator for LLMs with generalization guarantees to improve agreement rates with human judgments over heuristic baselines.
citing papers explorer
-
Evaluating Multi-turn Human-AI Interaction
Introduces the TCR framework to evaluate educational LLM assistants on transparency, consistency, and refinement in multi-turn interactions, complementing aggregate metrics.
-
Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
Develops a margin-adaptive learned confidence estimator for LLMs with generalization guarantees to improve agreement rates with human judgments over heuristic baselines.