Uni-OPD unifies on-policy distillation across LLMs and MLLMs with dual-perspective strategies that promote student exploration and enforce order-consistent teacher supervision based on outcome rewards.
Lyra: An efficient and speech-centric framework for omni-cognition.arXiv preprint arXiv:2412.09501
2 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Qwen2.5-Omni presents a multimodal model with block-wise encoders, TMRoPE position embeddings, and a Thinker-Talker architecture that enables simultaneous text and streaming speech generation while matching text performance on reasoning benchmarks.
citing papers explorer
-
Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
Uni-OPD unifies on-policy distillation across LLMs and MLLMs with dual-perspective strategies that promote student exploration and enforce order-consistent teacher supervision based on outcome rewards.
-
Qwen2.5-Omni Technical Report
Qwen2.5-Omni presents a multimodal model with block-wise encoders, TMRoPE position embeddings, and a Thinker-Talker architecture that enables simultaneous text and streaming speech generation while matching text performance on reasoning benchmarks.