Stochastic Particle Gradient Descent for Infinite Ensembles

Atsushi Nitanda; Taiji Suzuki

arxiv: 1712.05438 · v1 · pith:HOOQFRB5new · submitted 2017-12-14 · stat.ML · cs.LG · math.OC

keywords: method, methods, optimization, constraint, ensemble, infinite, problems, regularization

The superior performance of ensemble methods with infinite models is well known. Most of these methods are based on optimization problems in infinite-dimensional spaces with some regularization; for instance, boosting methods and convex neural networks use $L^1$-regularization with the non-negative constraint. However, due to the difficulty of handling $L^1$-regularization, these problems require early stopping or a rough approximation and are therefore solved only inexactly. In this paper, we propose a new ensemble learning method that operates in a space of probability measures; that is, our method can handle the $L^1$-constraint and the non-negative constraint in a rigorous way. This optimization is realized by a general-purpose stochastic optimization method for learning probability measures via parameterization using transport maps on base models. Running the method yields a transport map that outputs an infinite ensemble and forms a residual-type network. From the perspective of functional gradient methods, we give a convergence rate as fast as that of a stochastic optimization method for finite-dimensional nonconvex problems. Moreover, we show an interior optimality property of a local optimality condition used in our analysis.
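
The idea in the abstract of learning a probability measure over base models by moving particles can be illustrated with a small sketch. This is not the authors' algorithm or code; it is a toy under stated assumptions: the base model `tanh(w * x + b)`, the squared loss, the Gaussian initial measure, and every function name below are hypothetical choices made for illustration. The measure is approximated by equally weighted particles, so the weights are non-negative and sum to one by construction, and each update has the form "identity minus a small map", which is why composing the updates resembles a residual-type network.

```python
import math
import random


def base_model(theta, x):
    """Hypothetical base model h(theta, x) = tanh(w * x + b)."""
    w, b = theta
    return math.tanh(w * x + b)


def ensemble_predict(particles, x):
    """Average of the base model over equally weighted particles."""
    return sum(base_model(t, x) for t in particles) / len(particles)


def particle_step(particles, x, y, lr):
    """Move every particle along the negative functional gradient.

    For the squared loss 0.5 * (f(x) - y)**2 the direction at theta is
    (f(x) - y) * grad_theta h(theta, x). The update theta - lr * direction
    is one residual-type layer applied to all particles.
    """
    residual = ensemble_predict(particles, x) - y
    moved = []
    for w, b in particles:
        slope = 1.0 - math.tanh(w * x + b) ** 2  # derivative of tanh
        moved.append((w - lr * residual * slope * x,
                      b - lr * residual * slope))
    return moved


def mean_squared_error(particles, data):
    """Mean squared error of the particle ensemble on (x, y) pairs."""
    return sum((ensemble_predict(particles, x) - y) ** 2
               for x, y in data) / len(data)


def train(data, num_particles=50, steps=2000, lr=0.1, seed=0):
    """Sample particles from a Gaussian, then take stochastic steps."""
    rng = random.Random(seed)
    particles = [(rng.gauss(0, 1), rng.gauss(0, 1))
                 for _ in range(num_particles)]
    for _ in range(steps):
        x, y = rng.choice(data)  # one random example per step
        particles = particle_step(particles, x, y, lr)
    return particles


if __name__ == "__main__":
    # Toy regression target, chosen only for this illustration.
    data = [(i / 5.0, math.tanh(i / 5.0)) for i in range(-10, 11)]
    particles = train(data)
    print("training MSE:", mean_squared_error(particles, data))
```

Because the particles carry equal weight throughout, no projection or explicit $L^1$ penalty is needed in this sketch; only the particle locations change.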

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Thinned Mean Field Langevin Dynamics
cs.LG 2026-05 unverdicted novelty 7.0

KT-MFLD thins the particle system in mean-field Langevin dynamics to O(N^{3/2}) complexity with convergence guarantees matching standard MFLD up to logarithmic factors.

On the global convergence of gradient descent for wide shallow models with bounded nonlinearities
math.OC 2026-05 unverdicted novelty 6.0

Gradient descent on wide shallow models with bounded nonlinearities converges globally in the mean-field limit as non-global critical points are unstable under the dynamics.