arxiv: 2505.20925 · v1 · pith:TWLO4CJQnew · submitted 2025-05-27 · 💻 cs.CL · cs.AI

Multi-objective Large Language Model Alignment with Hierarchical Experts

classification 💻 cs.CL cs.AI
keywords preferences, across, experts, hierarchical, pareto, alignment, diverse

Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce HoE (Hierarchical Mixture-of-Experts), a lightweight, parameter-efficient, and plug-and-play approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, HoE consists of three hierarchical components: LoRA Experts, Router Experts and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size, training cost, and performance. We evaluate HoE across various tasks on 14 objectives and 200 different preferences among 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.
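
The abstract does not say how the LoRA experts are combined for a given user preference. As a rough illustration of the general idea of preference-weighted expert merging, and not of HoE's actual routing, the sketch below linearly mixes the low-rank updates of several single-objective experts using a normalized preference vector. The function names (`normalize_preference`, `merge_lora_deltas`) and the toy 2x2 experts are hypothetical, and linear mixing is an assumption chosen for simplicity.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, sufficient for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_preference(pref: Sequence[float]) -> List[float]:
    """Scale non-negative preference weights so that they sum to 1."""
    if any(p < 0 for p in pref):
        raise ValueError("preference weights must be non-negative")
    total = sum(pref)
    if total == 0:
        raise ValueError("at least one preference weight must be positive")
    return [p / total for p in pref]


def merge_lora_deltas(
    experts: Sequence[Tuple[Matrix, Matrix]], pref: Sequence[float]
) -> Matrix:
    """Return sum_i w_i * (B_i @ A_i) for LoRA factor pairs (B_i, A_i).

    Assumption: one expert per objective, mixed linearly. This is a generic
    merging rule used for illustration, not the method from the paper.
    """
    if not experts or len(experts) != len(pref):
        raise ValueError("need one preference weight per expert")
    weights = normalize_preference(pref)
    merged = None
    for w, (b, a) in zip(weights, experts):
        delta = matmul(b, a)  # full-size weight update of this expert
        if merged is None:
            merged = [[w * v for v in row] for row in delta]
        else:
            merged = [
                [m + w * v for m, v in zip(m_row, d_row)]
                for m_row, d_row in zip(merged, delta)
            ]
    return merged


# Two hypothetical rank-1 experts acting on a 2x2 weight matrix.
expert_one = ([[1.0], [0.0]], [[1.0, 0.0]])
expert_two = ([[0.0], [1.0]], [[0.0, 1.0]])
merged_delta = merge_lora_deltas([expert_one, expert_two], pref=[3.0, 1.0])
```

Here a preference of 3:1 between the two objectives is normalized to weights 0.75 and 0.25, so the merged update leans toward the first expert. Sweeping the preference vector over the simplex yields a family of merged updates, which is the sense in which a single set of experts can serve many user preferences without retraining.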

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SURF: Steering the Scalarization Weight to Uniformly Traverse the Pareto Front

    cs.LG 2026-05 unverdicted novelty 6.0

    SURF derives weight sampling rules from the arc-length CDF of the scalarization path to uniformly traverse the Pareto front in multi-objective optimization.

  2. Dynamic Model Merging Made Slim

    cs.LG 2026-05 unverdicted novelty 6.0

    DiDi-Merging achieves dynamic model merging performance matching or exceeding prior methods while using only 1.24x to 1.4x the parameters of a single fine-tuned model.

  3. Common-agency Games for Multi-Objective Test-Time Alignment

    cs.GT 2026-05 unverdicted novelty 6.0

    CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.

  4. RVPO: Risk-Sensitive Alignment via Variance Regularization

    cs.LG 2026-05 unverdicted novelty 6.0

    RVPO penalizes variance across multiple reward signals during RLHF advantage aggregation, using a LogSumExp operator as a smooth variance penalty to reduce constraint neglect in LLM alignment.

  5. Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework

    cs.CL 2026-05 unverdicted novelty 5.0

    C-BPO personalizes LLMs via preference-calibrated binary signals and PU learning theory to isolate inter-user differences from shared task knowledge.