Journal of Applied Meteorology

A New Vector Partition of the Probability Score

3 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

Is Capability a Liability? More Capable Language Models Make Worse Forecasts When It Matters Most

cs.AI · 2026-05-21 · unverdicted · novelty 7.0 · 2 refs

More capable LLMs produce worse distributional forecasts on superlinear growth time series with tail risks of regime change, with the error concentrated in the upper tail; this reverses on conventional threshold metrics.

TRIAGE: Evaluating Prospective Metacognitive Control in LLMs under Resource Constraints

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

TRIAGE evaluates LLMs on prospective metacognitive control by requiring a single plan for task selection, sequencing, and token allocation under a calibrated budget, revealing substantial gaps in current models across math, science, code, and knowledge tasks.

The Manokhin Probability Matrix: A Diagnostic Framework for Classifier Probability Quality

stat.ML · 2026-05-05 · unverdicted · novelty 6.0

A new 2x2 diagnostic matrix classifies probabilistic classifiers into Eagles, Bulls, Sloths, and Moles by calibration and discrimination, with empirical archetype assignments and a proof that post-hoc calibration cannot add discriminatory power.

citing papers explorer

Is Capability a Liability? More Capable Language Models Make Worse Forecasts When It Matters Most

cs.AI · 2026-05-21 · unverdicted · none · ref 17 · 2 links

TRIAGE: Evaluating Prospective Metacognitive Control in LLMs under Resource Constraints

cs.AI · 2026-05-13 · unverdicted · none · ref 27

The Manokhin Probability Matrix: A Diagnostic Framework for Classifier Probability Quality

stat.ML · 2026-05-05 · unverdicted · none · ref 3
