Yihan Cao, Yanbin Kang, Zhengming Xing, and Ruijie Jiang

URL: https://arxiv · 2025 · arXiv 2603.24596

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

Uni-OPD unifies on-policy distillation across LLMs and MLLMs with dual-perspective strategies that promote student exploration and enforce order-consistent teacher supervision based on outcome rewards.

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

cs.CL · 2026-05-07 · unverdicted · novelty 5.0 · 2 refs

TextPro-SLM reduces the speech-text modality gap by feeding an LLM backbone with synchronized text tokens and prosody embeddings from WhisperPro, achieving lowest gap scores at 3B/7B scales with roughly 1,000 hours of audio.

A Survey of On-Policy Distillation for Large Language Models

cs.LG · 2026-04-01 · unverdicted · novelty 3.0 · 2 refs

A survey that formalizes on-policy distillation as f-divergence minimization over student-sampled trajectories and organizes the literature along three design axes while linking it to KL-constrained RL.

Prefix Teach, Suffix Fade: Local Teachability Collapse in Strong-to-Weak On-Policy Distillation

cs.CL · 2026-05-13

citing papers explorer

Showing 4 of 4 citing papers.

Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
cs.LG · 2026-05-05 · unverdicted · none · ref 7

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM
cs.CL · 2026-05-07 · unverdicted · none · ref 32 · 2 links

A Survey of On-Policy Distillation for Large Language Models
cs.LG · 2026-04-01 · unverdicted · none · ref 1 · 2 links

Prefix Teach, Suffix Fade: Local Teachability Collapse in Strong-to-Weak On-Policy Distillation
cs.CL · 2026-05-13 · unreviewed · ref 7
