A survey on personalized and pluralistic preference alignment in large language models
7 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: unverdicted 7
Roles: background 3
Polarities: background 3
citing papers explorer
- ProactBench: Beyond What The User Asked For
  ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.
- Preference-Aware Rubric Learning for Personalized Evaluation
  PARL formulates personalized LLM evaluation as a learning problem that induces preference-aware rubrics from raw user histories via discriminative RL and self-validation.
- Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization
  TIPO applies preference-intensity weighting and padding gating to stabilize preference optimization for privacy personalization in mobile GUI agents, yielding higher alignment and distinction metrics than prior methods.
- Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs
  A survey that maps safety risks in personalized LLMs, introduces a unified taxonomy, and highlights three structural inadequacies in existing research on user-invariant safety, isolated techniques, and short-term evaluations.
- When to Ask a Question: Understanding Communication Strategies in Generative AI Tools
  A tradeoff model shows generative AI can reduce bias against diverse preferences by strategically eliciting information instead of always inferring from majority patterns.
- POPI: Personalizing LLMs via Optimized Natural Language Preference Inference
  POPI distills user preferences into reusable natural-language summaries via a shared inference model and conditions a generator on them, trained jointly with RL to improve personalization quality while cutting context length by up to 10x on benchmarks.
- Toward Human-Centered Multi-Agent Systems: Integrating Cognition, Culture, Values, and Cooperation in AI Agents
  A literature survey across cognitive science, sociolinguistics, and AI alignment that identifies the absence of unified frameworks for embedding cognition, culture, values, and cooperation into multi-agent LLM systems and outlines future directions.