Deep Learning as the Disciplined Construction of Tame Objects
Deep-learning models can be viewed as compositions of functions within so-called tame geometry. In this expository note, we give an overview of some topics at the interface of tame geometry (also known as o-minimality), optimization theory, and deep learning theory and practice. To do so, we gradually introduce the concepts and tools used to build convergence guarantees for stochastic gradient descent in a general nonsmooth, nonconvex, but tame setting. This illustrates some ways in which tame geometry is a natural mathematical framework for the study of AI systems, especially within deep learning.
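As a rough illustration of the setting described in the abstract (not code from the paper), the sketch below runs a stochastic subgradient method on a one-unit ReLU model, whose squared loss is piecewise polynomial and hence semialgebraic, nonsmooth, and nonconvex. The toy data, the names `predict`, `subgradient`, and `sgd`, the step-size schedule `c / (k + 1) ** power`, and the convention of taking slope 0 for the ReLU at its kink are all assumptions made for this example.

```python
import random


def relu(z):
    # ReLU is piecewise linear, hence semialgebraic (tame).
    return z if z > 0.0 else 0.0


def predict(a, b, x):
    # One-hidden-unit network: a composition of tame functions.
    return b * relu(a * x)


def mean_loss(a, b, data):
    # Average squared loss; nonsmooth in a where a * x == 0, nonconvex in (a, b).
    return sum(0.5 * (predict(a, b, x) - y) ** 2 for x, y in data) / len(data)


def subgradient(a, b, x, y):
    # Backprop-style selection: the ReLU slope at the kink is taken to be 0
    # (an assumed convention, one valid choice from the interval [0, 1]).
    r = predict(a, b, x) - y
    active = 1.0 if a * x > 0.0 else 0.0
    grad_a = r * b * x * active
    grad_b = r * relu(a * x)
    return grad_a, grad_b


def sgd(data, a=1.0, b=0.5, steps=2000, c=0.5, power=0.75, seed=0):
    # Diminishing steps c / (k + 1) ** power with 0.5 < power <= 1 are
    # non-summable but square-summable, the classical step-size condition.
    rng = random.Random(seed)
    for k in range(steps):
        x, y = rng.choice(data)  # sample one data point per iteration
        grad_a, grad_b = subgradient(a, b, x, y)
        step = c / (k + 1) ** power
        a -= step * grad_a
        b -= step * grad_b
    return a, b


# Hypothetical toy data generated by the target y = 2 * relu(x).
data = [(x, 2.0 * relu(x)) for x in (-1.0, -0.5, 0.5, 1.0)]
initial_loss = mean_loss(1.0, 0.5, data)
a, b = sgd(data)
final_loss = mean_loss(a, b, data)
print(initial_loss, final_loss)
```

On this toy problem the iterates stay in the region a > 0, where the loss is smooth, so the run only hints at the general picture; the convergence guarantees the note surveys are meant to cover trajectories that do meet the nonsmooth set.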
Forward citations
Cited by 2 Pith papers
- On convergence rates of subgradient descent on semialgebraic functions
Under Lipschitz stratification assumptions that hold automatically for semialgebraic functions, constant-step subgradient descent achieves explicit rates that improve with fewer strata and recover smooth-case rates up...
- Fast approximation and learning of binary classification tasks in o-minimal structures using ReLU neural networks
ReLU networks approximate traceable definable subsets of the unit cube in L^p with size O(ε^{-p(n-1)/m}) and yield ERM learning rates of order N^{-m/(m+pn-p)} for hinge loss under uniform component bounds.