Large language models are reasoning teachers

Namgyu Ho, Laura Schmid, Se · 2023 · DOI 10.18653/v1/2023.acl-long.830

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

open at publisher browse 7 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation

cs.CL · 2025-02-28 · unverdicted · novelty 7.0

CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.

Distribution Corrected Offline Data Distillation for Large Language Models

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

A distribution-correction framework for offline LLM reasoning distillation improves accuracy on math benchmarks by adaptively aligning teacher supervision with the student's inference-time distribution.

Memory in the Age of AI Agents

cs.CL · 2025-12-15 · unverdicted · novelty 6.0

The paper maps agent memory research via three forms (token-level, parametric, latent), three functions (factual, experiential, working), and dynamics of formation/evolution/retrieval, plus benchmarks and future directions.

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

cs.CL · 2023-09-29 · conditional · novelty 6.0

ToRA trains language models on interactive tool-use trajectories with imitation learning and output shaping to integrate reasoning and external tools, yielding 13-19% gains on math datasets and new highs like 44.6% on MATH for a 7B model.

OmniThoughtVis: A Scalable Distillation Pipeline for Deployable Multimodal Reasoning Models

cs.CL · 2026-05-12 · unverdicted · novelty 5.0

OmniThoughtVis curates 1.8M multimodal CoT samples via teacher distillation, difficulty annotation, and tag-based sampling, yielding consistent gains on nine reasoning benchmarks and allowing 4B models to match or beat undistilled 8B baselines.

Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning

cs.CL · 2026-04-06 · unverdicted · novelty 5.0 · 2 refs

ProxyCoT transfers CoT reasoning from proxy short contexts to full long contexts through RL/distillation followed by SFT, outperforming baselines with lower overhead and generalizing out-of-domain.

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

cs.AI · 2025-03-12 · unverdicted · novelty 5.0

The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.

citing papers explorer

Showing 7 of 7 citing papers.

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation cs.CL · 2025-02-28 · unverdicted · none · ref 99
CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.
Distribution Corrected Offline Data Distillation for Large Language Models cs.CL · 2026-05-13 · unverdicted · none · ref 16
A distribution-correction framework for offline LLM reasoning distillation improves accuracy on math benchmarks by adaptively aligning teacher supervision with the student's inference-time distribution.
Memory in the Age of AI Agents cs.CL · 2025-12-15 · unverdicted · none · ref 197
The paper maps agent memory research via three forms (token-level, parametric, latent), three functions (factual, experiential, working), and dynamics of formation/evolution/retrieval, plus benchmarks and future directions.
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving cs.CL · 2023-09-29 · conditional · none · ref 14
ToRA trains language models on interactive tool-use trajectories with imitation learning and output shaping to integrate reasoning and external tools, yielding 13-19% gains on math datasets and new highs like 44.6% on MATH for a 7B model.
OmniThoughtVis: A Scalable Distillation Pipeline for Deployable Multimodal Reasoning Models cs.CL · 2026-05-12 · unverdicted · none · ref 11
OmniThoughtVis curates 1.8M multimodal CoT samples via teacher distillation, difficulty annotation, and tag-based sampling, yielding consistent gains on nine reasoning benchmarks and allowing 4B models to match or beat undistilled 8B baselines.
Long-Context Reasoning Through Proxy-Based Chain-of-Thought Tuning cs.CL · 2026-04-06 · unverdicted · none · ref 16 · 2 links
ProxyCoT transfers CoT reasoning from proxy short contexts to full long contexts through RL/distillation followed by SFT, outperforming baselines with lower overhead and generalizing out-of-domain.
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models cs.AI · 2025-03-12 · unverdicted · none · ref 261
The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.

Large language models are reasoning teachers

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer