Toward the Evaluation of Large Language Models Considering Score Variance across Instruction Templates
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms
LLMs display inconsistent factual recall across different surface forms of the same entity, with greater robustness to minor spelling changes than to aliases or abbreviations.
- Edit-level Majority Voting Mitigates Over-Correction in LLM-based Grammatical Error Correction
Edit-level majority voting on multiple LLM-generated candidates reduces over-correction in grammatical error correction and outperforms greedy and MBR decoding on nine multilingual benchmarks while remaining stable to prompt variations.