Journal of the American Statistical Association

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields: cs.RO 1, cs.SE 1

years: 2026 (2)

verdicts: UNVERDICTED 2

representative citing papers

Computer Use at the Edge of the Statistical Precipice

cs.SE · 2026-05-07 · unverdicted · novelty 6.0

A blind replay script matches frontier model performance on static CUA benchmarks due to non-principled environments and evaluation methods, prompting PRISM design principles and the DigiWorld benchmark with improved statistical aggregation.

Practical validation of synthetic pre-crash scenarios

cs.RO · 2026-05-06 · unverdicted · novelty 6.0

A binning-based Bayesian ROPE equivalence testing method is introduced to quantitatively assess practical equivalence between synthetic and real pre-crash scenario datasets for driving automation safety impact evaluation.

citing papers explorer

  • Computer Use at the Edge of the Statistical Precipice cs.SE · 2026-05-07 · unverdicted · none · ref 18

  • Practical validation of synthetic pre-crash scenarios cs.RO · 2026-05-06 · unverdicted · none · ref 83
