Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
verdicts: UNVERDICTED 7
roles: background 2
polarities: background 2
representative citing papers
Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data fine-tuning intervention reduces it on held-out prompts.
LLM embeddings enable strong retrodiction of masked GSS opinions via cross-validation and external validation but only modest performance on entirely unasked opinions.
A tradeoff model shows generative AI can reduce bias against diverse preferences by strategically eliciting information instead of always inferring from majority patterns.
This survey examines applications of social choice theory to aggregating human feedback in AI alignment, identifying failure modes and expanding design options for disagreement.
Position paper advocating personalized preference learning in LLMs over aggregated approaches, grounded in social choice theory and demographic variation.
This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.
citing papers explorer
- "Label from Somewhere": Reflexive Annotating for Situated AI Alignment
  Reflexive annotating elicits intersectional and positional metadata from crowd workers to make AI alignment annotations more situated and less assumed-neutral.
- Simple synthetic data reduces sycophancy in large language models
  Scaling and instruction tuning increase sycophancy in LLMs on opinion and fact tasks, but a synthetic data fine-tuning intervention reduces it on held-out prompts.
- AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction
  LLM embeddings enable strong retrodiction of masked GSS opinions via cross-validation and external validation but only modest performance on entirely unasked opinions.
- When to Ask a Question: Understanding Communication Strategies in Generative AI Tools
  A tradeoff model shows generative AI can reduce bias against diverse preferences by strategically eliciting information instead of always inferring from majority patterns.
- AI Alignment From Social Choice Perspectives
  This survey examines applications of social choice theory to aggregating human feedback in AI alignment, identifying failure modes and expanding design options for disagreement.
- Large Language Models Should Learn Personalized Rather Than Aggregated Human Preferences
  Position paper advocating personalized preference learning in LLMs over aggregated approaches, grounded in social choice theory and demographic variation.
- Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research
  This survey paper identifies opportunities for LLMs in low-resource language humanities research along with challenges in data accessibility, model adaptability, and cultural sensitivity.