Gonzalez, Hao Zhang, and Ion Stoica
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: CONDITIONAL (3)
representative citing papers
Rank-Surprisal Ratio (RSR) correlates strongly (average Spearman 0.86) with post-distillation reasoning gains across five student models and trajectories from eleven teachers, outperforming existing selection metrics.
OpenCompass is a modular, high-concurrency platform for unified LLM evaluation across knowledge, reasoning, code, and other domains with support for rule-based, LLM-as-judge, and cascaded evaluators.
citing papers explorer
- How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data
  TESSY creates stylistically consistent synthetic data via teacher-student token interleaving, yielding 11.25% and 6.68% gains on code benchmarks where pure teacher data causes 3.25% and 10.02% drops.
- Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment
  Rank-Surprisal Ratio (RSR) correlates strongly (average Spearman 0.86) with post-distillation reasoning gains across five student models and trajectories from eleven teachers, outperforming existing selection metrics.
- OpenCompass: A Universal Evaluation Platform for Large Language Models
  OpenCompass is a modular, high-concurrency platform for unified LLM evaluation across knowledge, reasoning, code, and other domains with support for rule-based, LLM-as-judge, and cascaded evaluators.