PALM produces a small portfolio of LLMs that contains a near-optimal model for any user preference weight vector, with theoretical bounds on portfolio size and approximation quality.
Simultaneous multi-objective alignment across verifiable and non-verifiable rewards
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
SURF: Steering the Scalarization Weight to Uniformly Traverse the Pareto Front
SURF derives weight sampling rules from the arc-length CDF of the scalarization path to uniformly traverse the Pareto front in multi-objective optimization.