Adaswitch: Adaptive switching genera- tion for knowledge distillation.CoRR, abs/2510.07842

Peng, J · 2025 · arXiv 2510.07842

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

cs.CL · 2026-03-23 · conditional · novelty 7.0

TESSY creates stylistically consistent synthetic data via teacher-student token interleaving, yielding 11.25% and 6.68% gains on code benchmarks where pure teacher data causes 3.25% and 10.02% drops.

Reasoning Can Be Restored by Correcting a Few Decision Tokens

cs.AI · 2026-05-16 · conditional · novelty 6.0

Reasoning gaps between base LLMs and LRMs concentrate on ~8% of early planning tokens; intervening with the reasoning model only at high-disagreement positions recovers performance.

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

cs.CL · 2026-01-20

citing papers explorer

Showing 1 of 1 citing paper after filters.

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment cs.CL · 2026-01-20 · unreviewed · ref 32

Adaswitch: Adaptive switching genera- tion for knowledge distillation.CoRR, abs/2510.07842

fields

years

verdicts

representative citing papers

citing papers explorer