An empirical audit of LAION-2B-en and LAION-2B-multi finds overrepresentation of young adults, White people, and males, plus stereotypical emotion associations, across two attribute classifiers.
WorldBench: Quantifying geographic disparities in LLM factual recall
6 Pith papers cite this work.
Representative citing papers
Factual recall quality in LLMs follows a sigmoid scaling law in the log-linear combination of model parameter count and topic frequency in training data, explaining 60% of variance across models and up to 94% within families.
Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.
13 participants became convinced AI understands human values after chatbot interactions evaluated with the VAPT toolkit.
SafeScreen enforces individualized safety constraints as a prerequisite for video retrieval by using profile extraction, adaptive VideoRAG analysis, and LLM decision-making to approve content for vulnerable users.
A scoping review of AIES and FAccT literature concludes that AI trustworthiness research prioritizes technical precision over social, ethical, and institutional factors, leaving the sociotechnical nature of AI systems underexplored.
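
One of the summaries above describes factual recall as a sigmoid of a log-linear combination of model parameter count and topic frequency. The following is a minimal sketch of that functional form only, not the citing paper's fitted model: the function name `recall_probability`, the coefficients `a`, `b`, `c`, and the choice of natural logarithm are all assumptions made for illustration.

```python
import math


def recall_probability(n_params, topic_freq, a=0.5, b=0.5, c=-12.0):
    """Sigmoid of a log-linear combination of model size and topic frequency.

    a, b, and c are hypothetical placeholder coefficients, not values
    reported by any paper; they only illustrate the shape of the curve.
    """
    # Log-linear combination of the two predictors.
    z = a * math.log(n_params) + b * math.log(topic_freq) + c
    # The logistic (sigmoid) function maps z into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Usage: compare a small model on a rare topic with a larger model
# on a more frequent topic.
small_rare = recall_probability(n_params=1e8, topic_freq=10)
large_common = recall_probability(n_params=7e9, topic_freq=1e3)
```

With positive `a` and `b`, predicted recall rises with either model size or topic frequency and saturates toward 1, which is the qualitative behavior a sigmoid law of this kind implies.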