Increasing mini-batch sizes in SGD under alpha-stable heavy-tailed noise yield improved L^p convergence rates, convergence in probability with constant stepsizes, and explicit stable distributional limits for the iterates and Polyak-Ruppert averages.
3 Pith papers cite this work.
citing papers explorer
- Convergence of Stochastic Gradient Descent with mini-batching and infinite variance
  Increasing mini-batch sizes in SGD under alpha-stable heavy-tailed noise yield improved L^p convergence rates, convergence in probability with constant stepsizes, and explicit stable distributional limits for the iterates and Polyak-Ruppert averages.
- A Tutorial on Bayesian Optimization
  Bayesian optimization uses Gaussian process regression to build a surrogate model and acquisition functions to guide sampling for optimizing costly objective functions, including a new formal generalization of expected improvement to noisy evaluations.
- Stabilised weighted data subsampling for accelerated inference in models with recursive likelihoods