A new paired-prompt protocol reveals alignment-pipeline-specific heterogeneity in how open-weight LLMs respond to evaluation versus deployment framings.
Smith, Nicole DeCario, and Will Buchanan
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 6verdicts
UNVERDICTED 6representative citing papers
Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.
A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.
YouTube's recommendation algorithm prioritizes non-Kyrgyz content for Kyrgyz children, validating community concerns about cultural and linguistic erosion.
A framework is introduced that links activist needs (minimal overhead, community building, safety, sustainability) to DSN affordances and is applied to compare Mastodon and Bluesky plus example communities.
The thesis identifies theoretical, empirical, and conceptual flaws in offline fairness measures for recommender systems and contributes new evaluation methods and practical guidelines.
citing papers explorer
-
Measuring Evaluation-Context Divergence in Open-Weight LLMs: A Paired-Prompt Protocol with Pilot Evidence of Alignment-Pipeline-Specific Heterogeneity
A new paired-prompt protocol reveals alignment-pipeline-specific heterogeneity in how open-weight LLMs respond to evaluation versus deployment framings.
-
Who and What? Using Linguistic Features and Annotator Characteristics to Analyze Annotation Variation
Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.
-
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
A scoping review and empirical analysis produce a six-category taxonomy of factors driving AI non-development and abandonment, showing that practical issues like resource limits and organizational dynamics often outweigh ethical concerns in real decisions.
-
Empire Amplifier: Uncovering and Contesting the Prioritization of Colonial Content on Platforms Through Community-Informed Algorithmic Auditing
YouTube's recommendation algorithm prioritizes non-Kyrgyz content for Kyrgyz children, validating community concerns about cultural and linguistic erosion.
-
The Activist's Guide to the Decentralized Social Universe: A Framework for Exploring How Decentralized Social Networks Can Support Collective Action
A framework is introduced that links activist needs (minimal overhead, community building, safety, sustainability) to DSN affordances and is applied to compare Mastodon and Bluesky plus example communities.
-
Offline Evaluation Measures of Fairness in Recommender Systems
The thesis identifies theoretical, empirical, and conceptual flaws in offline fairness measures for recommender systems and contributes new evaluation methods and practical guidelines.