SDRT: Enhance Vision-Language Models by Self-Distillation with Diverse Reasoning Traces
2 Pith papers cite this work. Polarity classification is still indexing.
Citation summary: 2 citing papers, both from 2026, both still unverdicted; classified so far: role "method" (1), polarity "use" (1).
Citing papers
- Boosting Omni-Modal Language Models: Staged Post-Training with Visually Debiased Evaluation. Visual debiasing of omni-modal benchmarks combined with staged post-training lets a 3B model match or exceed a 30B model without a stronger teacher (a minimal sketch of the debiasing idea appears after this list).
- Hide to See: Reasoning-prefix Masking for Visual-anchored Thinking in VLM Distillation. A reasoning-prefix masking strategy during VLM distillation encourages students to anchor their thinking on visual evidence, yielding better multimodal reasoning than prior distillation baselines (see the loss-masking sketch after this list).
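The visual-debiasing idea from the first citing paper can be illustrated with a small filter: drop benchmark items that a text-only (blind) model already answers correctly, so the surviving items genuinely require the image. This is a minimal sketch under assumed interfaces; `blind_model.answer` and the item fields are hypothetical, not the paper's actual pipeline.

```python
# Hypothetical sketch of visual debiasing for a multimodal benchmark.
# Keeps only items a blind (text-only) model fails on, so remaining
# items cannot be solved from the question text alone.
def visually_debias(benchmark, blind_model):
    """Filter a benchmark to items that require visual evidence.

    benchmark  : list of dicts with 'question', 'answer', 'image' keys (assumed)
    blind_model: any object exposing answer(question: str) -> str (assumed)
    """
    debiased = []
    for item in benchmark:
        # Query the blind model with the question only; no image is provided.
        prediction = blind_model.answer(item["question"])
        if prediction.strip().lower() != item["answer"].strip().lower():
            # The blind model got it wrong, so the image is plausibly needed.
            debiased.append(item)
    return debiased
```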
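For the second paper, one plausible reading of reasoning-prefix masking is to exclude the teacher's leading reasoning tokens from the token-level training loss, concentrating supervision on the visually grounded continuation. The sketch below assumes a HuggingFace-style causal student model; `masked_distillation_labels`, `prompt_len`, and `prefix_len` are illustrative names, and the cited paper's exact masking strategy may differ.

```python
# Hypothetical sketch of reasoning-prefix masking in a distillation loss.
# The first `prefix_len` reasoning tokens (after the prompt) are hidden
# from the loss via the standard ignore label for cross-entropy.
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # label value that cross_entropy skips


def masked_distillation_labels(input_ids, prompt_len, prefix_len):
    """Build labels that supervise only tokens after the reasoning prefix.

    input_ids : (batch, seq) teacher trace = prompt + reasoning + answer (assumed layout)
    prompt_len: number of prompt tokens (image + question), never supervised
    prefix_len: number of leading reasoning tokens to mask out of the loss
    """
    labels = input_ids.clone()
    cutoff = prompt_len + prefix_len
    labels[:, :cutoff] = IGNORE_INDEX  # hide prompt and reasoning prefix
    return labels


def distillation_step(student, input_ids, attention_mask, prompt_len, prefix_len):
    """One training step; returns the masked next-token prediction loss."""
    labels = masked_distillation_labels(input_ids, prompt_len, prefix_len)
    out = student(input_ids=input_ids, attention_mask=attention_mask)
    # Shift logits and labels for next-token prediction.
    logits = out.logits[:, :-1].contiguous()
    targets = labels[:, 1:].contiguous()
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        ignore_index=IGNORE_INDEX,
    )
```

Masking via labels rather than attention keeps the teacher trace visible as context while removing its gradient signal over the prefix; whether the paper hides the prefix from the input as well is not stated in the summary above.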