pith. sign in

arxiv: 1911.13211 · v3 · pith:2E6R7AZSnew · submitted 2019-11-29 · 📊 stat.ML · cs.LG

Embedding and learning with signatures

classification 📊 stat.ML cs.LG
keywords embeddingsignaturescalledlearningalgorithmsapproachconsidereddata
0
0 comments X
read the original abstract

Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. A novel approach for sequential learning, called the signature method and rooted in rough path theory, is considered. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This approach relies critically on an embedding principle, which consists in representing discretely sampled data as paths, i.e., functions from $[0,1]$ to $\mathbb{R}^d$. After a survey of machine learning methodologies for signatures, the influence of embeddings on prediction accuracy is investigated with an in-depth study of three recent and challenging datasets. It is shown that a specific embedding, called lead-lag, is systematically the strongest performer across all datasets and algorithms considered. Moreover, an empirical study reveals that computing signatures over the whole path domain does not lead to a loss of local information. It is concluded that, with a good embedding, combining signatures with other simple algorithms achieves results competitive with state-of-the-art, domain-specific approaches.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.