Dr. Post-Training reframes general data as a data-induced regularizer for LLM post-training updates, yielding a family of methods that outperform data-selection baselines on SFT, RLHF, and RLVR tasks.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
APAPC integrates Nesterov acceleration into primal-dual forward-backward schemes by exploiting dual strong convexity to achieve optimal sublinear and accelerated linear convergence rates.
citing papers explorer
- Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training
  Dr. Post-Training reframes general data as a data-induced regularizer for LLM post-training updates, yielding a family of methods that outperform data-selection baselines on SFT, RLHF, and RLVR tasks.
- A Nesterov-Accelerated Primal-Dual Splitting Algorithm for Convex Nonsmooth Optimization
  APAPC integrates Nesterov acceleration into primal-dual forward-backward schemes by exploiting dual strong convexity to achieve optimal sublinear and accelerated linear convergence rates.