Chain-of-thought prompting elicits reasoning in large language models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al · 2022

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

RMA: an Agentic System for Research-Level Mathematical Problems

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

RMA, a multi-agent system with structured memory and iterative feedback loops, solves 8 out of 10 research-level math problems on the new First Proof benchmark and outperforms GPT-5.2R and Aletheia according to expert evaluation.

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

CASPO trains LLMs via iterative direct preference optimization so that token-level confidence tracks step-wise correctness, then applies Confidence-aware Thought pruning at inference to improve both reliability and speed on reasoning benchmarks.

Evaluating the False Trust Engendered by LLM Explanations

cs.HC · 2026-05-11 · unverdicted · novelty 5.0

LLM reasoning traces and post-hoc explanations increase false trust in incorrect predictions, whereas contrastive dual explanations enhance users' ability to distinguish correct from incorrect AI outputs.

citing papers explorer

Showing 3 of 3 citing papers.

RMA: an Agentic System for Research-Level Mathematical Problems cs.AI · 2026-05-20 · unverdicted · none · ref 47
RMA, a multi-agent system with structured memory and iterative feedback loops, solves 8 out of 10 research-level math problems on the new First Proof benchmark and outperforms GPT-5.2R and Aletheia according to expert evaluation.
Confidence-Aware Alignment Makes Reasoning LLMs More Reliable cs.AI · 2026-05-08 · unverdicted · none · ref 46
CASPO trains LLMs via iterative direct preference optimization so that token-level confidence tracks step-wise correctness, then applies Confidence-aware Thought pruning at inference to improve both reliability and speed on reasoning benchmarks.
Evaluating the False Trust Engendered by LLM Explanations cs.HC · 2026-05-11 · unverdicted · none · ref 49
LLM reasoning traces and post-hoc explanations increase false trust in incorrect predictions, whereas contrastive dual explanations enhance users' ability to distinguish correct from incorrect AI outputs.

Chain-of-thought prompting elicits reasoning in large language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer