A new evaluation framework using MMD on Biber features shows LLMs deviate from human linguistic distributions across registers, with closest models varying by register rather than size.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (2)
verdicts: UNVERDICTED (2)
representative citing papers
Adapts multi-layer token-level Mahalanobis distance with supervised linear regression to yield improved uncertainty scores for LLM truthfulness tasks.
citing papers explorer
- How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework
A new evaluation framework using MMD on Biber features shows LLMs deviate from human linguistic distributions across registers, with closest models varying by register rather than size.