Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models

Yin, Huifeng, Zhao, Yu, Wu, Minghao, Ni, Xuanfan, Zeng, Bo, Wang, Hao · 2025 · DOI 10.18653/v1/2025.acl-long.1145

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Discovering Millions of Interpretable Features with Sparse Autoencoders
cs.LG · 2026-06-25 · unverdicted · novelty 3.0 · none · ref 58

Trains and releases SAEs for Qwen3-1.7B/4B/8B models with layer-wise coverage and demonstrates causal steering of refusal via selected features.