PLC uses dynamic lenient gradient updates in a game-theoretic setup to let multi-preference LLM optimization escape local equilibria and reach better global Pareto fronts.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Beyond Compromise: Pareto-Lenient Consensus for Efficient Multi-Preference LLM Alignment