An Audit on the Perspectives and Challenges of Hallucinations in NLP
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
Citing papers
- Explain Like I'm 5 or Whatever I Choose: Evaluating the Interactive Potential of Language Model Responses
  A new evaluation framework shows that even the best LLM tested reliably adjusts response complexity in the intended direction only 46% of the time across 98 scientific queries.
- AInterviewer: A Platform for Designing and Conducting AI-led Qualitative Interviews
  AInterviewer is an open-source multi-agent platform for AI-led qualitative interviews. It combines controlled question administration with LLMs and supports local models through a web GUI.