Multi-objective Large Language Model Alignment with Hierarchical Experts

Deheng Ye; Fangming Liu; Guodong Du; Jing Li; Min Zhang; Weiyang Guo; Wenya Wang; Xiucheng Li; Yequan Wang; Yigeng Zhou

arxiv: 2505.20925 · v1 · pith:TWLO4CJQnew · submitted 2025-05-27 · 💻 cs.CL · cs.AI

Zhuo Li, Guodong Du, Weiyang Guo, Yigeng Zhou, Xiucheng Li, Wenya Wang, Fangming Liu, Yequan Wang, Deheng Ye, Min Zhang, Jing Li

classification 💻 cs.CL cs.AI

keywords preferences, across, experts, hierarchical, pareto, alignment, diverse

Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce HoE (Hierarchical Mixture-of-Experts), a lightweight, parameter-efficient, and plug-and-play approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, HoE consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size, training cost, and performance. We evaluate HoE across various tasks on 14 objectives and 200 different preferences among 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SURF: Steering the Scalarization Weight to Uniformly Traverse the Pareto Front
cs.LG 2026-05 unverdicted novelty 6.0

SURF derives weight sampling rules from the arc-length CDF of the scalarization path to uniformly traverse the Pareto front in multi-objective optimization.

Dynamic Model Merging Made Slim
cs.LG 2026-05 unverdicted novelty 6.0

DiDi-Merging achieves dynamic model merging performance matching or exceeding prior methods while using only 1.24x to 1.4x the parameters of a single fine-tuned model.

Common-agency Games for Multi-Objective Test-Time Alignment
cs.GT 2026-05 unverdicted novelty 6.0

CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.

RVPO: Risk-Sensitive Alignment via Variance Regularization
cs.LG 2026-05 unverdicted novelty 6.0

RVPO penalizes variance across multiple reward signals during RLHF advantage aggregation, using a LogSumExp operator as a smooth variance penalty to reduce constraint neglect in LLM alignment.

Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework
cs.CL 2026-05 unverdicted novelty 5.0

C-BPO personalizes LLMs via preference-calibrated binary signals and PU learning theory to isolate inter-user differences from shared task knowledge.