‘A Problem in Probability’

2 Pith papers cite this work. Polarity classification is still indexing.

Years: 2026 (2)

Verdicts: unverdicted (2)

representative citing papers

How reliable are LLMs when it comes to playing dice?

cs.CL · 2026-06-05 · unverdicted · novelty 5.0

LLMs score 0.96 on standard probability exercises but 0.59 on counterintuitive ones and drop further with biased wording or misleading cues, indicating they are not genuine probabilistic reasoners.

Counterintuitive problems in discrete probability

math.PR · 2026-06-05 · unverdicted · novelty 2.0

A curated dataset of counterintuitive discrete probability problems with human solutions, built to benchmark LLM reasoning on bias-prone tasks.

citing papers explorer

  • How reliable are LLMs when it comes to playing dice? cs.CL · 2026-06-05 · unverdicted · none · ref 20

  • Counterintuitive problems in discrete probability math.PR · 2026-06-05 · unverdicted · none · ref 10