2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2
years
2026 2
verdicts
UNVERDICTED 2
representative citing papers
CT-AGD accelerates first-order optimization in deep learning by using finite-difference curvature estimates and noise-mitigation heuristics, achieving equivalent accuracy with 33% fewer training epochs and overhead comparable to Adam.
citing papers explorer
- Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training
  Dr. Post-Training reframes general data as a data-induced regularizer for LLM post-training updates, yielding a family of methods that outperform data-selection baselines on SFT, RLHF, and RLVR tasks.
- Accelerated Gradient Descent for Faster Convergence with Minimal Overhead
  CT-AGD accelerates first-order optimization in deep learning by using finite-difference curvature estimates and noise-mitigation heuristics, achieving equivalent accuracy with 33% fewer training epochs and overhead comparable to Adam.