- Vary topics, severities, and threat types
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Less Approximates More: Harmonizing Performance and Confidence Faithfulness via Hybrid Post-Training for High-Stakes Tasks
HyTuning uses a progressive reasoning gain metric to reweight reasoning distillation and RLIF, improving both accuracy and confidence faithfulness in LLMs under limited supervision.