In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
3 Pith papers cite this work.
Years: 2026 (3)
Verdicts: unverdicted (3)
citing papers explorer
- An Empirical Analysis of Factual Errors in Human-Written Text and its Application
  An empirical study distills a taxonomy of human factual errors from newspaper corrections and shows LLMs achieve only 52% F1 on detection.
- Adversarial Robustness of Activation Steering in Large Language Models
  First systematic test shows activation steering robustness drops sharply (up to 64%) under adversarial input perturbations across multiple extraction methods, models, and personas.
- A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories
  Open-weight instruction-aware encoders capture equal or greater affective information than proprietary models at word level across emotion theories, while task-tuned and proprietary encoders perform best on sentence-level classification.