Note on the sampling error of the difference between correlated proportions or percentages

· 1947

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PlotChain: Deterministic Checkpointed Evaluation of Multimodal LLMs on Engineering Plot Reading

cs.AI · 2026-01-29 · conditional · novelty 7.0

PlotChain benchmark reports top MLLMs reaching ~80% field-level accuracy on engineering plot reading under human-like tolerances, but with persistent failures on frequency-domain tasks like bandpass and FFT spectra.

Will It Break in Production? Metric-Driven Prediction of Residual Defects in Python Systems

cs.SE · 2026-04-29 · unverdicted · novelty 4.0

Supervised models using 83 metrics achieve 0.85-0.9 recall for post-release Python faults, outperforming LLMs, with process metrics and code size most predictive and metrics plus embeddings capturing complementary information.

citing papers explorer

PlotChain: Deterministic Checkpointed Evaluation of Multimodal LLMs on Engineering Plot Reading · cs.AI · 2026-01-29 · conditional · none · ref 27
Will It Break in Production? Metric-Driven Prediction of Residual Defects in Python Systems · cs.SE · 2026-04-29 · unverdicted · none · ref 30
