and Rieser, Verena and Gabriel, Iason , month = jun, year =
5 Pith papers cite this work.
citing papers explorer
- Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency
  Factual recall quality in LLMs follows a sigmoid scaling law in the log-linear combination of model parameter count and topic frequency in training data, explaining 60% of variance across models and up to 94% within families.
- Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs
  Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.
- AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations
  13 participants became convinced AI understands human values after chatbot interactions evaluated with the VAPT toolkit.
- SafeScreen: A Safety-First Screening Framework for Personalized Video Retrieval for Vulnerable Users
  SafeScreen enforces individualized safety constraints as a prerequisite for video retrieval by using profile extraction, adaptive VideoRAG analysis, and LLM decision-making to approve content for vulnerable users.
- Understanding AI Trustworthiness: A Scoping Review of AIES & FAccT Articles
  A scoping review of AIES and FAccT literature concludes that AI trustworthiness research prioritizes technical precision over social, ethical, and institutional factors, leaving the sociotechnical nature of AI systems underexplored.