A new paired-prompt protocol reveals alignment-pipeline-specific heterogeneity in how open-weight LLMs respond to evaluation versus deployment framings.
Smith, Nicole DeCario, and Will Buchanan
6 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (6)
verdicts: UNVERDICTED (6)
representative citing papers
Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.
A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.
YouTube's recommendation algorithm prioritizes non-Kyrgyz content for Kyrgyz children, validating community concerns about cultural and linguistic erosion.
A framework is introduced that links activist needs (minimal overhead, community building, safety, sustainability) to DSN affordances and is applied to compare Mastodon and Bluesky plus example communities.
The thesis identifies theoretical, empirical, and conceptual flaws in offline fairness measures for recommender systems and contributes new evaluation methods and practical guidelines.
citing papers explorer
- Offline Evaluation Measures of Fairness in Recommender Systems