A monotone co-design framework for neural network processors that treats uncertainty via Confidence as a tunable resource and allows modular block refinement without structural changes.
arXiv preprint arXiv:2006.08198 , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
representative citing papers
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.
citing papers explorer
-
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.