Random Forests: some methodological insights

· 2008 · stat.ML · arXiv 0811.3619

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

This paper examines, from an experimental perspective, random forests, the increasingly used statistical method for classification and regression problems introduced by Leo Breiman in 2001. It first aims to confirm known but sparse advice on using random forests and to propose some complementary remarks, both for standard problems and for high-dimensional ones in which the number of variables hugely exceeds the sample size. The main contribution of the paper is twofold: to provide some insights into the behavior of the variable importance index based on random forests and, in addition, to investigate two classical issues of variable selection. The first is to find important variables for interpretation; the second is more restrictive and tries to design a good prediction model. The strategy involves a ranking of explanatory variables using the random forests score of importance and a stepwise ascending variable introduction strategy.
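The two-step strategy named in the abstract (rank variables by importance, then introduce them one at a time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's procedure: the function names, the `min_gain` acceptance rule, and the toy importance scores and error function are all hypothetical, and in practice the scores and the error estimate would come from fitted random forests (for example, an out-of-bag error).

```python
from typing import Callable, Dict, List, Sequence, Tuple


def rank_by_importance(importance: Dict[str, float]) -> List[str]:
    """Return variable names sorted from most to least important."""
    return sorted(importance, key=lambda name: importance[name], reverse=True)


def stepwise_ascending_selection(
    ranked: Sequence[str],
    estimate_error: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Introduce variables in ranked order.

    A variable is kept only if it lowers the estimated error by more
    than `min_gain` (an assumed acceptance rule, chosen for illustration).
    """
    selected: List[str] = []
    best_error = float("inf")
    for name in ranked:
        candidate = selected + [name]
        error = estimate_error(candidate)
        if best_error - error > min_gain:
            selected, best_error = candidate, error
    return selected, best_error


# Hypothetical importance scores standing in for a forest's importance index.
toy_importance = {"x1": 0.50, "x2": 0.05, "x3": 0.30, "x4": 0.01}


def toy_error(variables: List[str]) -> float:
    """Hypothetical stand-in for the error of a forest fit on `variables`."""
    useful = {"x1": 0.20, "x3": 0.10}  # only x1 and x3 reduce the error
    reduction = sum(useful.get(v, 0.0) for v in variables)
    return 0.40 - reduction + 0.01 * len(variables)  # small cost per variable


ranked = rank_by_importance(toy_importance)
selected, final_error = stepwise_ascending_selection(ranked, toy_error)
```

In this toy setup the uninformative variables are tried in ranked order but rejected, because adding them does not lower the estimated error. A stricter `min_gain` would yield a smaller model, which loosely mirrors the abstract's distinction between selecting variables for interpretation and selecting them for prediction.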

representative citing papers

A Stationary-Distribution Theory for Triplet-Based Plateau Search in Random Forest Ensemble-Size Selection

cs.LG · 2026-06-29 · unverdicted · novelty 6.0

Derives the stationary distribution and asymptotic scaling O(ε^{-2}) for ensemble size in a Markov chain model of triplet-based plateau tuning for random forests.

How Many Trees in a Random Forest? A Revisited Approach with Plateau Search and Optuna Integration

cs.LG · 2026-06-02 · conditional · novelty 6.0

A triplet-based plateau search algorithm is proposed to adaptively determine a near-minimal number of trees for random forests by monitoring relative OOB score changes across forest size triplets, removing n_trees from the TPE search space.

citing papers explorer

Showing 2 of 2 citing papers.

A Stationary-Distribution Theory for Triplet-Based Plateau Search in Random Forest Ensemble-Size Selection

cs.LG · 2026-06-29 · unverdicted · none · ref 10 · internal anchor

Derives the stationary distribution and asymptotic scaling O(ε^{-2}) for ensemble size in a Markov chain model of triplet-based plateau tuning for random forests.

How Many Trees in a Random Forest? A Revisited Approach with Plateau Search and Optuna Integration

cs.LG · 2026-06-02 · conditional · none · ref 57 · internal anchor

A triplet-based plateau search algorithm is proposed to adaptively determine a near-minimal number of trees for random forests by monitoring relative OOB score changes across forest size triplets, removing n_trees from the TPE search space.
