Near-optimal nonmyopic value of information in graphical models
2 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (2)
Verdicts: unverdicted (2)
Roles: background (1)
Polarities: background (1)
citing papers explorer
- MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization
  MASS-DPO derives a Plackett-Luce-specific log-determinant Fisher information objective to select non-redundant negative samples, matching or exceeding multi-negative DPO performance with substantially fewer negatives across four benchmarks and three model families.
- A Hough transform approach to safety-aware scalar field mapping using Gaussian Processes
  A framework models scalar fields with Gaussian processes and uses Hough transforms on the posterior to detect and avoid high-intensity unsafe regions for safe robot mapping.