Unveiling the key factors for distilling chain-of-thought reasoning

Xinghao Chen, Zhijing Sun, Guo Wenjin, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen · 2025 · DOI 10.18653/v1/2025.findings-acl.782

4 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Decoupling KL and Trajectories: A Unified Perspective for SFT, DAgger, Offline RL, and OPD in LLM Distillation

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Decoupling prefix source from token-level KL direction in autoregressive sequence KL yields four objectives unifying SFT, DAgger, offline RL and OPD, with KL mixing and entropy-gated curriculum improving math reasoning accuracy and shortening responses.

Distribution Corrected Offline Data Distillation for Large Language Models

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

A distribution-correction framework for offline LLM reasoning distillation improves accuracy on math benchmarks by adaptively aligning teacher supervision with the student's inference-time distribution.

Emergent Manifold Separability during Reasoning in Large Language Models

cs.LG · 2026-02-23 · unverdicted · novelty 6.0

Reasoning in LLMs produces a transient geometric pulse in which concept manifolds untangle into linearly separable subspaces immediately before computation and compress afterward.

GateKD: Confidence-Gated Closed-Loop Distillation for Robust Reasoning

cs.CL · 2026-05-13

citing papers explorer

Showing 4 of 4 citing papers.

Decoupling KL and Trajectories: A Unified Perspective for SFT, DAgger, Offline RL, and OPD in LLM Distillation · cs.LG · 2026-05-16 · unverdicted · none · ref 5
Distribution Corrected Offline Data Distillation for Large Language Models · cs.CL · 2026-05-13 · unverdicted · none · ref 6
Emergent Manifold Separability during Reasoning in Large Language Models · cs.LG · 2026-02-23 · unverdicted · none · ref 5
GateKD: Confidence-Gated Closed-Loop Distillation for Robust Reasoning · cs.CL · 2026-05-13 · unreviewed · ref 20
