Machine Learning

Breiman, Leo · 1996 · DOI 10.1007/bf00058655

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Reducing cross-sample prediction churn in scientific machine learning

cs.LG · 2026-05-13 · accept · novelty 7.0

Cross-sample prediction churn between bootstrap-trained classifiers reaches 8-22% on chemistry benchmarks; K-bootstrap bagging reduces it 40-54% and twin-bootstrap with sym-KL consistency loss reduces it a further median 45% at matched 2x compute.

LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification

cs.AI · 2026-05-08 · conditional · novelty 7.0

LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.

Ensemble Monitoring for AI Control: Diverse Signals Outweigh More Compute

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

Diverse ensembles of prompted and fine-tuned GPT-4.1-Mini monitors achieve 2.4x better detection of flawed code solutions than homogeneous ensembles on adversarial inputs.

citing papers explorer

Showing 3 of 3 citing papers.

Reducing cross-sample prediction churn in scientific machine learning · cs.LG · 2026-05-13 · accept · none · ref 3
LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification · cs.AI · 2026-05-08 · conditional · none · ref 125
Ensemble Monitoring for AI Control: Diverse Signals Outweigh More Compute · cs.AI · 2026-05-14 · unverdicted · none · ref 34
