Double-checker: Enhancing reasoning of slow-thinking LLMs via self-critical fine-tuning. arXiv preprint arXiv:2506.21285, 2025

Xin Xu, Tianhao Chen, Fan Zhang, Wanlong Liu, Pengxiang Li, Ajay Kumar Jaiswal, Yuchen Yan, Jishan Hu, Yang Wang, Hao Chen, et al · 2025 · arXiv 2506.21285

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Teaching Large Language Models When Not to Know: Learning Temporal Critique for Ex-Ante Reasoning

cs.AI · 2026-05-14 · unverdicted · novelty 5.0

TCFT trains LLMs on temporal critique tasks to reduce post-cutoff knowledge leakage by 37-42 percentage points over prompting and standard SFT on Qwen models.

citing papers explorer

Showing 1 of 1 citing paper.

Teaching Large Language Models When Not to Know: Learning Temporal Critique for Ex-Ante Reasoning
cs.AI · 2026-05-14 · unverdicted · none · ref 47
TCFT trains LLMs on temporal critique tasks to reduce post-cutoff knowledge leakage by 37-42 percentage points over prompting and standard SFT on Qwen models.

