2 Pith papers cite this work.
Citing papers
- MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization
  MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
- Involutive (simple) latin solutions of the Yang-Baxter equation and related (left) quasigroups
  Classification and structural description of simple involutive latin solutions to the Yang-Baxter equation with regular displacement group and nilpotent permutation group, including enumeration for size p^p.