Advances in Neural Information Processing Systems

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

stat.ML · 2026-05-13 · unverdicted · novelty 6.0

A conformal procedure for CoT replaces majority voting with weighted aggregation and calibrates abstention to guarantee low confident-error rates, achieving 90.1% selective accuracy on GSM8K by abstaining on under 5% of cases.

Targeted Tests for LLM Reasoning: An Audit-Constrained Protocol

cs.LG · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

Presents an audit-constrained protocol for targeted LLM reasoning evaluation using component grammar prompt variants and shows that Component-Adaptive Prompt Sampling does not outperform uniform sampling in audited yield.

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · novelty 5.0

LLMs fail at extended counting of repeated characters due to finite internal states, with abrupt errors persisting across model scales and inference methods.

citing papers explorer

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

stat.ML · 2026-05-13 · unverdicted · none · ref 23

Targeted Tests for LLM Reasoning: An Audit-Constrained Protocol

cs.LG · 2026-05-12 · unverdicted · none · ref 4 · 2 links

Language models fail at extended rule following

cs.CL · 2026-05-03 · unverdicted · none · ref 19
