Deconstructing long chain-of-thought: A structured reasoning optimization framework for long CoT distillation

Luo, Y · 2025 · arXiv 2503.16385

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Are Rationales Necessary and Sufficient? Tuning LLMs for Explainable Misinformation Detection

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

LONSREX introduces a metric-based pipeline to identify necessary and sufficient rationales when creating training data for fine-tuning LLMs on explainable misinformation detection, addressing limitations of naive label-based filtering.

Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

CoRD uses collaborative multi-teacher step-wise decoding with perplexity-guided beam search to generate higher-quality Long-CoT data that lets smaller models reach near-teacher performance with less supervision.

On the Step Length Confounding in LLM Reasoning Data Selection

cs.CL · 2026-04-08 · unverdicted · novelty 6.0

Average log probability selection for LLM reasoning datasets is confounded by step length because longer steps dilute low-probability first tokens; ASLEC-DROP and ASLEC-CASL remove this bias.

Training-Trajectory-Aware Token Selection

cs.CL · 2026-01-15 · unverdicted · novelty 6.0

Training-Trajectory-Aware Token Selection (T3S) reconstructs the token-level training objective to overcome a performance bottleneck in continual distillation of reasoning capabilities from large to small language models.

TRUST: A Framework for Decentralized AI Service v.0.1

cs.AI · 2026-04-29 · unverdicted · novelty 5.0

TRUST is a decentralized AI auditing framework that decomposes reasoning into HDAGs, maps agent interactions via the DAAN protocol to CIGs, and uses stake-weighted multi-tier consensus to achieve 72.4% accuracy while proving a Safety-Profitability Theorem that rewards honest auditors.

Investigating Thinking Behaviours of Reasoning-Based Language Models for Social Bias Mitigation

cs.CL · 2025-10-20 · unverdicted · novelty 5.0

Reasoning LLMs aggregate social biases through stereotype repetition and irrelevant information injection in their thinking processes, and a self-review prompt mitigates this on BBQ, StereoSet, and BOLD benchmarks.

citing papers explorer

Showing 6 of 6 citing papers.

Are Rationales Necessary and Sufficient? Tuning LLMs for Explainable Misinformation Detection

cs.CL · 2026-05-19 · unverdicted · none · ref 22

Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

cs.AI · 2026-05-04 · unverdicted · none · ref 37

On the Step Length Confounding in LLM Reasoning Data Selection

cs.CL · 2026-04-08 · unverdicted · none · ref 3

Training-Trajectory-Aware Token Selection

cs.CL · 2026-01-15 · unverdicted · none · ref 13

TRUST: A Framework for Decentralized AI Service v.0.1

cs.AI · 2026-04-29 · unverdicted · none · ref 26

Investigating Thinking Behaviours of Reasoning-Based Language Models for Social Bias Mitigation

cs.CL · 2025-10-20 · unverdicted · none · ref 11
