Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

4 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CL (3) · cs.AI (1)

years: 2026 (3) · 2024 (1)

representative citing papers

Step-by-Step Optimization-like Reasoning in LLMs over Expanding Search Spaces

cs.AI · 2026-06-03 · unverdicted · novelty 7.0

Introduces OPT* tasks and two training regimes (solver-guided online policy optimization with rank-based reward shaping and search-based offline RL) plus a theoretical link between search success and information extraction per budget unit, showing empirical gains in optimization-like reasoning.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Step-by-Step Optimization-like Reasoning in LLMs over Expanding Search Spaces cs.AI · 2026-06-03 · unverdicted · none · ref 12

    Introduces OPT* tasks and two training regimes (solver-guided online policy optimization with rank-based reward shaping and search-based offline RL) plus a theoretical link between search success and information extraction per budget unit, showing empirical gains in optimization-like reasoning.

  • MTA: Multi-Granular Trajectory Alignment for Large Language Model Distillation cs.CL · 2026-05-02 · unverdicted · none · ref 106

    MTA is a distillation method that aligns teacher-student LLM representations along their transformation trajectories using layer-adaptive granularities and dynamic structural plus hidden representation alignment losses.

  • Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning cs.CL · 2026-04-06 · unverdicted · none · ref 15 · 2 links

    ProxyCoT transfers CoT reasoning from proxy short contexts to full long contexts through RL/distillation followed by SFT, outperforming baselines with lower overhead and generalizing out-of-domain.