2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
JTS trains reasoning models via supervised warm-up and missing-premise RL to make an explicit answerability commitment that triggers early termination on unanswerable inputs, raising Abstention@Detection near saturation.
Citing papers explorer
- When Model Merging Breaks Routing: Training-Free Calibration for MoE
  Merging breaks MoE routing via softmax sensitivity; HARC uses Hessian curvature for closed-form router calibration that improves merged model performance without retraining.
- Bridging the Detection-to-Abstention Gap in Reasoning Models under Insufficient Information
  JTS trains reasoning models via supervised warm-up and missing-premise RL to make an explicit answerability commitment that triggers early termination on unanswerable inputs, raising Abstention@Detection near saturation.