Large language model hacking: Quantifying the hidden risks of using LLMs for text annotation

Baumann, Joachim, Röttger, Paul, Urman, Aleksandra, Wendsjö, Albert, Plaza-del-Arco, Flor Miriam, Gruber, Johannes B · 2024 · arXiv 2509.08825

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Agentic-imodels: Evolving agentic interpretability tools via autoresearch

cs.AI · 2026-05-05 · unverdicted · novelty 7.0

Agentic-imodels evolves scikit-learn regressors via an autoresearch loop to jointly boost predictive performance and LLM-simulatability, improving downstream agentic data science tasks by up to 73% on the BLADE benchmark.

Navigating the Conceptual Multiverse

cs.HC · 2026-04-20 · unverdicted · novelty 7.0

The conceptual multiverse system with a verification framework for decision structures helps users in philosophy, AI alignment, and poetry build clearer working maps of open-ended problems by making implicit LLM choices explicit and changeable.

Safe for Whom? Rethinking How We Evaluate the Safety of LLMs for Real Users

cs.AI · 2025-12-11 · unverdicted · novelty 6.0

LLM safety evaluations for personal advice must test responses against diverse user vulnerability profiles, since context-blind ratings overestimate safety and realistic prompt context does not fix the problem.

Researchers waste 80% of LLM annotation costs by classifying one text at a time

cs.CL · 2026-04-04 · accept · novelty 5.0

Batching texts and stacking variables in LLM prompts reduces annotation costs by over 80% while maintaining accuracy within 2pp of single-item baselines for most models, with errors smaller than human inter-coder disagreement.

Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science

stat.OT · 2026-05-19 · conditional · novelty 4.0

Multiverse analysis of three published CSS studies reveals substantial variation in findings across methodological decision combinations and identifies cases of computational failure not reported in originals.

citing papers explorer

Agentic-imodels: Evolving agentic interpretability tools via autoresearch cs.AI · 2026-05-05 · unverdicted · none · ref 38
Navigating the Conceptual Multiverse cs.HC · 2026-04-20 · unverdicted · none · ref 4
Safe for Whom? Rethinking How We Evaluate the Safety of LLMs for Real Users cs.AI · 2025-12-11 · unverdicted · none · ref 33
Researchers waste 80% of LLM annotation costs by classifying one text at a time cs.CL · 2026-04-04 · accept · none · ref 2
Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science stat.OT · 2026-05-19 · conditional · none · ref 33
