TCFT trains LLMs on temporal critique tasks to reduce post-cutoff knowledge leakage by 37-42 percentage points over prompting and standard SFT on Qwen models.
Double-checker: Enhancing reasoning of slow-thinking LLMs via self-critical fine-tuning. arXiv preprint arXiv:2506.21285, 2025.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Teaching Large Language Models When Not to Know: Learning Temporal Critique for Ex-Ante Reasoning