The Volterra signature

Fabian N. Harang; Luca Pelizzari; Paul P. Hager; Samy Tindel

arXiv: 2603.04525 · v2 · pith:2EZHO36Onew · submitted 2026-03-04 · stat.ML · cs.LG


Pith reviewed 2026-05-22 11:31 UTC · model grok-4.3


keywords: Volterra signature · path signature · universal approximation · kernel methods · tensor algebra · non-Markovian time series · injectivity


The pith

The Volterra signature is an injective kernel-weighted feature map that yields universal approximation by linear functionals on path space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Volterra signature by weighting an input path with a temporal kernel and developing the result into the tensor algebra. It uses the Volterra-Chen identity to prove injectivity under path augmentation, which implies a universal approximation theorem for functions on infinite-dimensional path space. In certain cases this approximation is achieved simply by linear functionals of the signature. The construction also supports a kernel trick via a two-parameter integral equation for the inner product and reduces to a linear ODE for exponential kernels while remaining invariant to time reparameterization.
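The reparameterization invariance mentioned above can be illustrated in the classical case (K = Id), where it is a standard property of iterated integrals. The sketch below is not the paper's construction: `sig2` and `refine` are hypothetical helper names, the signature is truncated at level 2, and the path is assumed piecewise linear. Resampling the same trace with extra points leaves the computed terms unchanged.

```python
def sig2(path):
    """Levels 1 and 2 of the classical signature of a piecewise-linear path.

    `path` is a list of points (tuples) in R^d. This is the K = Id case only.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        dx = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # Appending one linear segment: S2 += S1 (x) dx + dx (x) dx / 2
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2


def refine(path):
    """Insert the midpoint of every segment: same trace, different sampling."""
    out = [path[0]]
    for p, q in zip(path, path[1:]):
        mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
        out += [mid, q]
    return out


coarse = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
print(sig2(coarse))
print(sig2(refine(coarse)))  # same values up to rounding
```

Whether and how this carries over to a kernel that depends on the time variable is a matter for the paper itself; the sketch only shows the classical mechanism.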

Core claim

By embedding the kernel-weighted input path into the tensor algebra, the Volterra signature satisfies the Volterra-Chen identity and thereby establishes injectivity on augmented paths together with a universal approximation theorem on path space that linear functionals attain in some cases.

What carries the argument

The Volterra signature VSig(x;K) defined via development of the kernel-weighted path into the tensor algebra, together with the Volterra-Chen identity that supplies the injectivity and approximation guarantees.
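For orientation, the classical Chen identity (the K = Id analogue of the Volterra-Chen identity named above) states that the signature of a concatenated path is the tensor product of the signatures of its pieces. The following is a minimal sketch truncated at level 2 for piecewise-linear paths; `sig2` and `chen_product` are illustrative names, and nothing here reproduces the kernel-weighted identity from the paper.

```python
def sig2(path):
    """Levels 1 and 2 of the classical signature of a piecewise-linear path."""
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        dx = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2


def chen_product(sig_a, sig_b):
    """Truncated tensor product (1, a1, a2) (x) (1, b1, b2), kept up to level 2."""
    a1, a2 = sig_a
    b1, b2 = sig_b
    d = len(a1)
    c1 = [a1[i] + b1[i] for i in range(d)]
    c2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)] for i in range(d)]
    return c1, c2


first = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
second = [(3.0, 1.0), (2.0, -1.0), (4.0, 0.0)]
whole = first + second[1:]  # concatenation at the shared endpoint

print(sig2(whole))
print(chen_product(sig2(first), sig2(second)))  # agrees up to rounding
```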

If this is right

Linear functionals of the Volterra signature achieve universal approximation on path space for certain kernels.
The inner product between Volterra signatures is given by a closed two-parameter integral equation that permits PDE-based numerical methods.
For exponential-type kernels the signature evolves according to a linear state-space ODE in the tensor algebra.
The signature remains invariant under time reparameterization.
It improves performance relative to classical path signature baselines on dynamic learning tasks with real and synthetic data.
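The link between exponential kernels and a state-space ODE can be seen at the first level only, under the assumption that the weighting takes the form y_t = ∫_0^t e^{-λ(t-s)} dx_s. Differentiating in t gives dy_t = -λ y_t dt + dx_t, so the history integral can be updated step by step instead of being recomputed. The sketch below compares a direct left-point sum with the recursive update; the function names and the scalar setting are assumptions made for illustration, and the paper's ODE lives in the tensor algebra rather than in R.

```python
import math


def weighted_integral_direct(times, xs, lam):
    """Left-point sum for y_T = int_0^T exp(-lam * (T - s)) dx_s (full history pass)."""
    T = times[-1]
    return sum(
        math.exp(-lam * (T - times[k])) * (xs[k + 1] - xs[k])
        for k in range(len(xs) - 1)
    )


def weighted_integral_recursive(times, xs, lam):
    """Same sum via a one-step state update, the discrete analogue of dy = -lam*y dt + dx."""
    y = 0.0
    for k in range(len(xs) - 1):
        dt = times[k + 1] - times[k]
        # Add the new increment, then decay the whole state over the step.
        y = math.exp(-lam * dt) * (y + (xs[k + 1] - xs[k]))
    return y


times = [i / 50 for i in range(51)]
xs = [math.sin(3.0 * t) for t in times]
print(weighted_integral_direct(times, xs, 2.0))
print(weighted_integral_recursive(times, xs, 2.0))  # same value up to rounding
```

The recursive form needs only the current state, which is the practical appeal of kernels admitting such a representation.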

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Extending the class of admissible kernels beyond the exponential-type family, which admits ODE representations, could enlarge the range of usable temporal weightings.
The injectivity result may clarify identifiability questions arising in other signature-based models for time series.
Combining the ODE representation with numerical schemes for the integral equation could produce efficient long-horizon implementations.

Load-bearing premise

The Volterra-Chen identity and the injectivity and approximation results that follow from it hold for the selected class of temporal kernels K.

What would settle it

Exhibiting two distinct augmented paths whose Volterra signatures coincide for a fixed kernel K would disprove the injectivity statement.
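Why the statement is phrased for augmented paths can be seen in the classical case (K = Id): a path that goes out and retraces itself has the same signature as a constant path, and adding a time coordinate separates the two. The sketch below checks this at levels 1 and 2; `sig2` and `augment` are hypothetical helpers, and the example says nothing about which augmentation the paper uses.

```python
def sig2(path):
    """Levels 1 and 2 of the classical signature of a piecewise-linear path."""
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        dx = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2


def augment(path):
    """Prepend a uniformly spaced time coordinate on [0, 1] to every point."""
    n = len(path) - 1
    return [(k / n,) + p for k, p in enumerate(path)]


out_and_back = [(0.0,), (1.0,), (0.0,)]
constant = [(0.0,), (0.0,), (0.0,)]

# Without augmentation the two paths are indistinguishable at these levels.
print(sig2(out_and_back), sig2(constant))

# With a time coordinate the level-2 cross term differs.
print(sig2(augment(out_and_back))[1][0][1], sig2(augment(constant))[1][0][1])
```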

Figures

Figures reproduced from arXiv: 2603.04525 by Fabian N. Harang, Luca Pelizzari, Paul P. Hager, Samy Tindel.

**Figure 4.1.** Volterra signature vs. classical signature expansions, compared with the fractional SDE solution (4.2) for one testing sample. The models were trained with M = 900 training samples and N = 500 timesteps on [0, 1]. Parameters: Y0 = 1, b0 = 0, b1 = −1, σ0 = 1, σ1 = 0.5, signature truncation L = 5. Sig (K = Id): Expanding classical SDE solution with iterated integrals of time-augmented Brownian motion …

**Figure 4.2.** Next-day forecast of realized S&P 500 volatility using our method VSig, compared with the benchmark HAR, on the test set. The lower subplot shows the absolute forecast errors |ŷ − y| for both methods, where y denotes the realized volatility.

**Figure 4.3.** Left: Coefficient of determination R² as a (linearly interpolated) function of the past-window size p (days), reported on the training and test sets for the methods VSig and Sig, and the (constant) benchmark HAR. Right: Scatter plots of realized volatility y versus predictions ŷ for our method VSig and the HAR benchmark.

Original abstract

Modern approaches for learning from non-Markovian time series, such as recurrent neural networks, neural controlled differential equations or transformers, typically rely on implicit memory mechanisms that can be difficult to interpret or to train over long horizons. We propose the \emph{Volterra signature} $\mathrm{VSig}(x;K)$ as a principled, explicit feature representation for history-dependent systems. By developing the input path $x$ weighted by a temporal kernel $K$ into the tensor algebra, we leverage the associated Volterra--Chen identity to derive rigorous learning-theoretic guarantees. Specifically, we prove an \emph{injectivity} statement (identifiability under augmentation) that leads to a \emph{universal approximation} theorem on the infinite dimensional path space, which in certain cases is achieved by \emph{linear functionals} of $\mathrm{VSig}(x;K)$. Moreover, we demonstrate applicability of the \emph{kernel trick} by showing that the inner product associated with Volterra signatures admits a closed characterization via a two-parameter integral equation, enabling numerical methods from PDEs for computation. For a large class of exponential-type kernels, $\mathrm{VSig}(x;K)$ solves a linear state-space ODE in the tensor algebra. Combined with inherent invariance to time reparameterization, these results position the Volterra signature as a robust, computationally tractable feature map for data science. We demonstrate its efficacy in dynamic learning tasks on real and synthetic data, where it consistently improves classical path signature baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The Volterra signature weights the path signature by a temporal kernel and attaches injectivity plus universal approximation claims for certain kernels, along with a PDE inner-product trick.

The main takeaway is that this paper defines a kernel-weighted version of the path signature, called the Volterra signature, and uses the Volterra-Chen identity to prove injectivity under augmentation and a universal approximation result on path space. For some kernels the approximation is attained by linear functionals of the signature alone. The authors also characterize the inner product through a two-parameter integral equation and show that, for exponential-type kernels, the signature solves a linear ODE in the tensor algebra. Experiments on real and synthetic data report better performance than plain path signatures in dynamic tasks.

Referee Report

1 major / 1 minor

Summary. The paper introduces the Volterra signature VSig(x;K), obtained by developing an input path x weighted by a temporal kernel K into the tensor algebra. It uses the Volterra-Chen identity to establish an injectivity result (identifiability under augmentation) that implies a universal approximation theorem on infinite-dimensional path space, with linear functionals of VSig(x;K) sufficing in some cases. Additional results include a closed-form characterization of the associated inner product via a two-parameter integral equation (enabling PDE-based computation), a linear state-space ODE representation in the tensor algebra for a large class of exponential-type kernels, and time-reparameterization invariance. The method is positioned as an explicit, interpretable feature map for history-dependent systems and is tested on dynamic learning tasks against path-signature baselines.
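To make the "PDE-based computation" remark concrete, here is a sketch for the classical signature kernel only, which is known to solve a Goursat-type problem ∂²k/∂s∂t = ⟨ẋ_s, ẏ_t⟩ k with k = 1 on the two axes. The explicit first-order scheme and the name `signature_kernel` are illustrative choices; the two-parameter integral equation for the Volterra signature is not reproduced here and may require a different discretization.

```python
import math


def signature_kernel(x, y):
    """Explicit finite-difference solve of k_st = <dx_s, dy_t> k with k = 1 on the axes.

    x and y are piecewise-linear paths given as lists of points (tuples) in R^d.
    This is the classical (K = Id) signature kernel, not the Volterra one.
    """
    m, n = len(x) - 1, len(y) - 1
    k = [[1.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        dx = [b - a for a, b in zip(x[i], x[i + 1])]
        for j in range(n):
            dy = [b - a for a, b in zip(y[j], y[j + 1])]
            inc = sum(u * v for u, v in zip(dx, dy))
            # Four-point stencil: k11 - k10 - k01 + k00 = inc * k00
            k[i + 1][j + 1] = k[i + 1][j] + k[i][j + 1] + (inc - 1.0) * k[i][j]
    return k[m][n]


# Sanity check on x(s) = s, y(t) = t over [0, 1], where the inner product of
# the two signatures is the series sum_n 1 / (n!)^2.
N = 200
line = [(i / N,) for i in range(N + 1)]
series = sum(1.0 / math.factorial(n) ** 2 for n in range(20))
print(signature_kernel(line, line), series)
```

The scheme is first order in the grid size, so the two printed numbers agree only approximately and get closer as N grows.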

Significance. If the injectivity and approximation results hold with the stated scope, the work supplies a principled explicit alternative to implicit memory mechanisms in recurrent models or neural CDEs, together with computational tools (ODE/PDE) and invariance properties that could aid long-horizon time-series tasks. The explicit link between kernel choice, linear ODEs, and approximation guarantees is a potential strength for interpretability.

major comments (1)

[Abstract and kernel-class definition] Abstract and the section defining the kernel class: the injectivity, Volterra-Chen identity, and universal approximation theorems are asserted for 'a large class of exponential-type kernels' that admit a linear ODE representation, yet no explicit necessary and sufficient conditions on K (analyticity, decay rate, positivity, or other regularity) are supplied. Because the central claims rest on the identity holding inside this class, the lack of a precise delineation is load-bearing for the scope of the identifiability and approximation guarantees.

minor comments (1)

[Experiments] The data experiments are summarized at a high level; adding concrete details on dataset sizes, exact baselines, hyper-parameter selection, and statistical significance testing would improve reproducibility and allow readers to assess the practical improvement over path signatures.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the manuscript. We address the major comment below and have revised the paper to provide a clearer delineation of the kernel class.


Referee: [Abstract and kernel-class definition] Abstract and the section defining the kernel class: the injectivity, Volterra-Chen identity, and universal approximation theorems are asserted for 'a large class of exponential-type kernels' that admit a linear ODE representation, yet no explicit necessary and sufficient conditions on K (analyticity, decay rate, positivity, or other regularity) are supplied. Because the central claims rest on the identity holding inside this class, the lack of a precise delineation is load-bearing for the scope of the identifiability and approximation guarantees.

Authors: We agree that the original presentation did not supply explicit necessary and sufficient conditions on K, which leaves the precise scope of the injectivity and approximation results somewhat implicit. In the revised manuscript we have added a dedicated subsection (Section 2.3) that characterizes the admissible kernels. A kernel K belongs to the class if and only if the associated Volterra operator admits a finite-dimensional linear state-space realization in the tensor algebra; this holds precisely when K is analytic in a neighborhood of the diagonal, exhibits exponential decay in |t-s|, and satisfies a positivity condition ensuring the induced inner product is positive semi-definite. Sufficient conditions are stated for kernels of the form K(t,s) = sum_{i=1}^m p_i(t) q_i(s) exp(lambda_i (t-s)) with analytic p_i, q_i and Re(lambda_i) < 0. We include concrete examples (standard exponential kernels, certain Matérn kernels) and discuss the minimal regularity on the path x required for convergence. The abstract and theorem statements have been updated to reference this characterization. These additions make the load-bearing assumptions explicit while preserving the original results for the delineated class.

Revision: yes

Circularity Check

0 steps flagged

No circularity: the injectivity and approximation theorems rest on the Volterra-Chen identity and do not reduce to the paper's own inputs or to self-citations.

The paper derives its central injectivity statement and universal approximation theorem by leveraging the Volterra-Chen identity applied to the weighted tensor series VSig(x;K) for a class of exponential-type kernels admitting linear ODE representation. This identity is presented as an associated external tool rather than derived within the paper, and the proofs are stated as independent learning-theoretic guarantees. No equations reduce the claimed results to fitted parameters, self-definitions, or load-bearing self-citations; the kernel class is delimited by the identity's applicability without circular redefinition. Numerical experiments are separated from the theoretical claims, leaving the derivation chain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claims rest on the Volterra-Chen identity for the weighted signature, standard properties of the tensor algebra, and the existence of a suitable class of kernels that admit both the ODE representation and the injectivity result. No free parameters are fitted inside the theoretical statements; the kernel K is treated as a modeling choice.

axioms (2)

Domain assumption: The Volterra-Chen identity holds for the kernel-weighted lift into the tensor algebra.
Invoked to derive injectivity and universal approximation from the construction.

Domain assumption: The chosen kernels belong to a class (exponential-type) for which the signature satisfies a linear state-space ODE.
Required for the computational tractability claim.

invented entities (1)

Volterra signature VSig(x;K) · no independent evidence
Purpose: Explicit feature representation for history-dependent paths.
New object defined by weighting the path with kernel K before lifting to the tensor algebra.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We prove an injectivity statement (identifiability under augmentation) that leads to a universal approximation theorem on the infinite dimensional path space, which in certain cases is achieved by linear functionals of VSig(x;K)."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "For a large class of exponential-type kernels, VSig(x;K) solves a linear state-space ODE in the tensor algebra."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Universal Approximation of Nonlinear Operators and Their Derivatives
cs.LG · 2026-05 · unverdicted · novelty 8.0
Proves the first universal approximation theorems for k-times differentiable nonlinear operators between Banach spaces and their derivatives uniformly on compact sets in weighted Sobolev norms via encoder-decoder oper...

Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version
stat.ML · 2026-04 · unverdicted · novelty 7.0
An off-model training architecture using explicit dominating laws and Radon-Nikodym weights enables adaptive learning for non-Markovian stochastic control, with non-asymptotic error bounds separating Monte Carlo and m...

Computational aspects of the Volterra Signature
math.NA · 2026-05 · unverdicted · novelty 6.0
Algorithms for Volterra signature computation achieve O(J^2), O(J log J) via FFT, and O(J R^2) via recursion, plus a predictor-corrector scheme, all implemented in a public JAX package.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · cited by 3 Pith papers · 4 internal anchors

[1]

Multifactor approximation of rough volatility models.SIAM journal on financial mathematics, 10(2):309–349, 2019

Eduardo Abi Jaber and Omar El Euch. Multifactor approximation of rough volatility models.SIAM journal on financial mathematics, 10(2):309–349, 2019. 17, 38

work page 2019
[2]

Hedging with memory: shallow and deep learning with signatures

Eduardo Abi Jaber and Louis-Amand Gérard. Hedging with memory: shallow and deep learning with signatures. 2025. 28

work page 2025
[3]

Volatility models in practice: Rough, path-dependent, or Markov- ian?Mathematical Finance, 2025

Eduardo Abi Jaber and Shaun Li. Volatility models in practice: Rough, path-dependent, or Markov- ian?Mathematical Finance, 2025. 17

work page 2025
[4]

Branched Signature Model

Munawar Ali and Qi Feng. Branched Signature Model. Preprint, arXiv:2511.00018 [math.NA] (2025),

work page arXiv 2025
[5]

Theory of reproducing kernels.Transactions of the American Mathematical So- ciety, 68(3):337–404, 1950

Nachman Aronszajn. Theory of reproducing kernels.Transactions of the American Mathematical So- ciety, 68(3):337–404, 1950. 30

work page 1950
[6]

Berlin, Heidelberg : Springer-Verlag Berlin Heidelberg, 2011

Hajer Bahouri, Jean-Yves Chemin, and Danchin Raphaël.Fourier Analysis and Nonlinear Partial Differential Equations, volume 343 ofGrundlehren der mathematischen Wissenschaften, A Series of Comprehensive Studies in Mathematics. Berlin, Heidelberg : Springer-Verlag Berlin Heidelberg, 2011. 28

work page 2011
[7]

Hager, Sebastian Riedel, and Tobias Nauen

Peter Bank, Christian Bayer, Paul P. Hager, Sebastian Riedel, and Tobias Nauen. Stochastic control with signatures.SIAM Journal on Control and Optimization, 63(5):3189–3218, 2025. 28, 30

work page 2025
[8]

Barndorff-Nielsen, Fred Espen Benth, and Almut E

Ole E. Barndorff-Nielsen, Fred Espen Benth, and Almut E. D. Veraart.Ambit Stochastics, volume 88 ofProbability Theory and Stochastic Modelling. Springer, Cham, 2018. 8

work page 2018
[9]

Markovian approximations of stochastic Volterra equations with the fractional kernel.Quantitative Finance, 23(1):53–70, 2023

Christian Bayer and Simon Breneis. Markovian approximations of stochastic Volterra equations with the fractional kernel.Quantitative Finance, 23(1):53–70, 2023. 17, 38

work page 2023
[10]

Springer Finance

Christian Bayer, Gonçalo dos Reis, Blanka Horvath, and Harald Oberhauser, editors.Signature Meth- ods in Finance: An Introduction with Computational Applications. Springer Finance. Springer, Cham,

work page
[11]

eBook published 07 Nov 2025;©2026 Springer Nature. 2

work page 2025
[12]

Friz, Paul Gassiat, Jorg Martin, and Benjamin Stemper

Christian Bayer, Peter K. Friz, Paul Gassiat, Jorg Martin, and Benjamin Stemper. A regularity structure for rough volatility.Math. Finance, 30(3):782–832, 2020. 3, 4

work page 2020
[13]

Optimal stopping with signatures.The Annals of Applied Probability, 33(1):238–273, 2023

Christian Bayer, Paul P Hager, Sebastian Riedel, and John Schoenmakers. Optimal stopping with signatures.The Annals of Applied Probability, 33(1):238–273, 2023. 28

work page 2023
[14]

Pricing American options under rough volatility using deep-signatures and signature-kernels.arXiv preprint arXiv:2501.06758, 2025

Christian Bayer, Luca Pelizzari, and Jia-Jie Zhu. Pricing American options under rough volatility using deep-signatures and signature-kernels.arXiv preprint arXiv:2501.06758, 2025. 28

work page arXiv 2025
[15]

Learning long-term dependencies with gradient descent is difficult.IEEE Transactions on Neural Networks, 5(2):157–166, 1994

Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult.IEEE Transactions on Neural Networks, 5(2):157–166, 1994. 1

work page 1994
[16]

Springer Finance

Fred Espen Benth and Paul Krühner.Stochastic Models for Prices Dynamics in Energy and Com- modity Markets. Springer Finance. Springer, Cham, November 2023. 8

work page 2023
[17]

Prömel, and David Scheffels

Martin Bergerhausen, David J. Prömel, and David Scheffels. Neural stochastic Volterra equations: Learning path-dependent dynamics.Journal of Machine Learning, 4(4):264–289, December 2025. 2

work page 2025
[18]

Berndt and James Clifford

Donald J. Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In KDD Workshop, volume 10, pages 359–370, 1994. 30

work page 1994
[19]

The signature of a rough path: unique- ness.Advances in Mathematics, 293:720–737, 2016

Horatio Boedihardjo, Xi Geng, Terry Lyons, and Danyu Yang. The signature of a rough path: unique- ness.Advances in Mathematics, 293:720–737, 2016. 23

work page 2016
[20]

Boyd and L

S. Boyd and L. Chua. Fading memory and the problem of approximating nonlinear operators with Volterra series.IEEE Transactions on Circuits and Systems, 32(11):1150–1161, 1985. 1

work page 1985
[21]

Springer, New York, 2019

Fred Brauer, Carlos Castillo-Chavez, and Zhilan Feng.Mathematical models in epidemiology, vol- ume 69 ofTexts in Applied Mathematics. Springer, New York, 2019. With a foreword by Simon Levin. 1

work page 2019
[22]

Ramification of Volterra-type rough paths.Electronic Journal of Probability, 28:1–25, 2023

Yvain Bruned and Foivos Katsetsiadis. Ramification of Volterra-type rough paths.Electronic Journal of Probability, 28:1–25, 2023. 3, 4 THE VOLTERRA SIGNATURE 42

work page 2023
[23]

Burov and E

S. Burov and E. Barkai. Fractional Langevin equation: overdamped, underdamped, and critical be- haviors.Phys. Rev. E (3), 78(3):031112, 18, 2008. 1

work page 2008
[24]

A survey of commodity markets and structural models for elec- tricity prices

René Carmona and Michael Coulon. A survey of commodity markets and structural models for elec- tricity prices. InQuantitative energy finance, pages 41–83. Springer, New York, 2014. 1

work page 2014
[25]

Chan and Nuno Vasconcelos

Antoni B. Chan and Nuno Vasconcelos. Probabilistic kernels for the classification of auto-regressive visual processes. In2005 IEEE Computer Society Conference on Computer Vision and Pattern Recog- nition (CVPR’05), volume 1, pages 846–851. IEEE, 2005. 30

work page 2005
[26]

Kuo-TsaiChen.Iteratedintegralsandexponentialhomomorphisms.Proc. Lond. Math. Soc. (3), 4:502– 512, 1954. 8

work page 1954
[27]

Integration of paths, geometric invariants and a Generalized Baker–Hausdorff formula

Kuo-Tsai Chen. Integration of paths, geometric invariants and a Generalized Baker–Hausdorff formula. Annals of Mathematics, 65(1):163–178, 1957. 2

work page 1957
[28]

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differen- tial equations. InProceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS’18, page 6572–6583, Red Hook, NY, USA, 2018. Curran Associates Inc. 2

work page 2018
[29]

Feature engineering with regularity struc- tures.J

Ilya Chevyrev, Andris Gerasimovičs, and Hendrik Weber. Feature engineering with regularity struc- tures.J. Sci. Comput., 98(1):28, 2024. Id/No 13. 4

work page 2024
[30]

A primer on the signature method in machine learning.arXiv preprint arXiv:1603.03788, 2016

Ilya Chevyrev and Andrey Kormilitzin. A primer on the signature method in machine learning.arXiv preprint arXiv:1603.03788, 2016. 24, 36

work page arXiv 2016
[31]

Comte and E

F. Comte and E. Renault. Long memory continuous time models.Journal of Econometrics, 73(1):101– 149, 1996. 1

work page 1996
[32]

Corduneanu.Integral equations and applications

C. Corduneanu.Integral equations and applications. Cambridge: Cambridge University Press, reprint of the 1991 hardback ed. edition, 2008. 2

work page 1991
[33]

A simple approximate long-memory model of realized volatility.Journal of financial econometrics, 7(2):174–196, 2009

Fulvio Corsi. A simple approximate long-memory model of realized volatility.Journal of financial econometrics, 7(2):174–196, 2009. 39

work page 2009
[34]

Springer Nature Switzerland, Cham, January 2026

Dan Crisan, Ilya Chevyrev, Thomas Cass, James Foster, Christian Litterer, and Cristopher Salvi, editors.Stochastic Analysis and Applications 2025: In Honour of Terry Lyons. Springer Nature Switzerland, Cham, January 2026. Hardcover. ISBN-10: 3032039134. xii+436 pp. eBook ISBN: 9783032039149. 2

work page 2025
[35]

Autoregressive Kernels For Time Series

Marco Cuturi and Arnaud Doucet. Autoregressive kernels for time series.arXiv preprint arXiv:1101.0673, 2011. 30

work page internal anchor Pith review Pith/arXiv arXiv 2011
[36]

Generalized iterated-sums signatures.J

Joscha Diehl, Kurusch Ebrahimi-Fard, and Nikolas Tapia. Generalized iterated-sums signatures.J. Algebra, 632:801–824, 2023. 4

work page 2023
[37]

Jeffrey L. Elman. Finding structure in time.Cognitive Science, 14(2):179–211, 1990. 1

work page 1990
[38]

Friz and Nicolas B

Peter K. Friz and Nicolas B. Victoir.Multidimensional stochastic processes as rough paths, volume 120 ofCambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010. Theory and applications. 20

work page 2010
[39]

Volatility is rough.Quantitative Finance, 18(6):933–949, 2018

Jim Gatheral, Thibault Jaisson, and Mathieu Rosenbaum. Volatility is rough.Quantitative Finance, 18(6):933–949, 2018. 1

work page 2018
[40]

Pricing and calibration in the 4-factor path-dependent volatility model.Quantitative Finance, 25(3):471–489, 2025

Guido Gazzani and Julien Guyon. Pricing and calibration in the 4-factor path-dependent volatility model.Quantitative Finance, 25(3):471–489, 2025. 38

work page 2025
[41]

Kistler.Spiking Neuron Models: Single Neurons, Populations, Plasticity

Wulfram Gerstner and Werner M. Kistler.Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, Cambridge, 2002. 1

work page 2002
[42]

Number 34

Gustaf Gripenberg, Stig-Olof Londen, and Olof Staffans.Volterra integral and functional equations. Number 34. Cambridge University Press, 1990. 2, 17

work page 1990
[43]

Volatility is (mostly) path-dependent.Quant

Julien Guyon and Jordan Lekeufack. Volatility is (mostly) path-dependent.Quant. Finance, 23(9):1221–1258, 2023. 1, 17, 38, 39

work page 2023
[44]

Hager, Fabian N

Paul P. Hager, Fabian N. Harang, Luca Pelizzari, and Samy Tindel. Computational aspects of the Volterra signature. arXiv:xxxx.xxxxx, March 2026. 3, 10, 17, 18, 37

work page 2026
[45]

A Wong-Zakai theorem for stochastic PDEs.J

Martin Hairer and Étienne Pardoux. A Wong-Zakai theorem for stochastic PDEs.J. Math. Soc. Japan, 67(4):1551–1604, 2015. 3

work page 2015
[46]

Uniqueness for the signature of a path of bounded variation and the reduced path group.Ann

Ben Hambly and Terry Lyons. Uniqueness for the signature of a path of bounded variation and the reduced path group.Ann. of Math. (2), 171(1):109–167, 2010. 2, 23

work page 2010
[47]

Harang, Fred Espen Benth, and Fride Straum

Fabian A. Harang, Fred Espen Benth, and Fride Straum. Universal approximation on non-geometric rough paths and applications to financial derivatives pricing, 2024. 28

work page 2024
[48]

Harang and Samy Tindel

Fabian A. Harang and Samy Tindel. Volterra equations driven by rough signals.Stochastic Process. Appl., 142:34–78, 2021. 3, 4, 7, 8, 10

work page 2021
[49]

Harang, Samy Tindel, and Xiaohua Wang

Fabian A. Harang, Samy Tindel, and Xiaohua Wang. Volterra equations driven by rough signals 2: Higher-order expansions.Stoch. Dyn., 23(1):Paper No. 2350002, 50, 2023. 3, 4 THE VOLTERRA SIGNATURE 43

work page 2023
[50]

Oxford-man institute’s realized li- brary.Version 0.1, Oxford&Man Institute, University of Oxford, 2009

Gerd Heber, Asger Lunde, Neil Shephard, and Kevin Sheppard. Oxford-man institute’s realized li- brary.Version 0.1, Oxford&Man Institute, University of Oxford, 2009. 38

work page 2009
[51]

Higham.Functions of matrices

Nicholas J. Higham.Functions of matrices. Theory and computation. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), 2008. 19

work page 2008
[52]

Long short-term memory.Neural Computation, 9(8):1735– 1780, 1997

Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory.Neural Computation, 9(8):1735– 1780, 1997. 1

work page 1997
[53]

Path-dependent processes from signa- tures, 2024

Eduardo Abi Jaber, Louis-Amand Gérard, and Yuxing Huang. Path-dependent processes from signa- tures, 2024. 37, 38

work page 2024
[54]

Exponentially fading memory signature.arXiv preprint arXiv:2507.03700, 2025

Eduardo Abi Jaber and Dimitri Sotnikov. Exponentially fading memory signature.arXiv preprint arXiv:2507.03700, 2025. 4

work page arXiv 2025
[55]

Michael I. Jordan. Serial order: A parallel distributed processing approach.Advances in psychology, 121:471–495, 1997. 1

work page 1997
[56]

Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang

George Em Karniadakis, Ioannis G. Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning.Nature Reviews Physics, 3(6):422–440, 2021. 1

work page 2021
[57]

James Kidger, James Morrill, Patrick T. P. Tang, and Terry Lyons. Neural controlled differential equations for irregular time series. InAdvances in Neural Information Processing Systems (NeurIPS),

work page
[58]

Kernels for sequentially ordered data

Franz J Király and Harald Oberhauser. Kernels for sequentially ordered data.arXiv preprint arXiv:1601.08169, 2016. 30

work page internal anchor Pith review Pith/arXiv arXiv 2016
[59]

Király and Harald Oberhauser

Franz J. Király and Harald Oberhauser. Kernels for sequentially ordered data.Journal of Machine Learning Research, 20(31):1–45, 2019. 2, 30, 33

work page 2019
[60]

Kloeden and Eckhard Platen

Peter E. Kloeden and Eckhard Platen. Stratonovich and Itô stochastic Taylor expansions.Math. Nachr., 151(1):33–50, 1991. 37

work page 1991
[61]

S. C. Kou. Stochastic modeling in nanoscale biophysics: subdiffusion within proteins.Ann. Appl. Stat., 2(2):501–535, 2008. 1

work page 2008
[62]

W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of ethernet traffic (extended version).IEEE/ACM Transactions on Networking, 2(1):1–15, 1994. 1

work page 1994
[63]

Zachary C. Lipton. The mythos of model interpretability.Commun. ACM, 61(10):36–43, September

work page
[64]

On a chen–fliess approximation for diffusion functionals

Christian Litterer and Harald Oberhauser. On a chen–fliess approximation for diffusion functionals. Monatshefte für Mathematik, 175(4):577–593, 2014. 29

work page 2014
[65]

Terry J. Lyons. Differential equations driven by rough signals.Rev. Mat. Iberoam., 14(2):215–310,

work page
[66]

Signature methods in machine learning.EMS Surv

Andrew McLeod and Terry Lyons. Signature methods in machine learning.EMS Surv. Math. Sci., February 2025. Published online first (19 February 2025). 2

work page 2025
[67]

Moreno, Purdy Ho, and Nuno Vasconcelos

Pedro J. Moreno, Purdy Ho, and Nuno Vasconcelos. A kullback-leibler divergence based kernel for svm classification in multimedia applications. InAdvances in Neural Information Processing Systems (NIPS), pages 1385–1392, 2003. 30

[68]

Bernt Øksendal and Tu-Sheng Zhang. The Stochastic Volterra Equation, pages 168–202. Birkhäuser Basel, Basel, 1993.

[69]

Etienne Pardoux and Philip Protter. Stochastic Volterra equations with anticipating coefficients. Ann. Probab., 18(4):1635–1655, 1990.

[70]

Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), pages 1310–1318, 2013.

[71]

Philip Protter. Volterra equations driven by semimartingales. Ann. Probab., 13(2):519–530, 1985.

[72]

Maziar Raissi, Paris Perdikaris, and George E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.

[73]

Jeremy Reizenstein and Benjamin Graham. The iisignature library: efficient calculation of iterated-integral signatures and log signatures. arXiv preprint arXiv:1802.08252, 2018.

[74]

Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1:206–215, 2019.

[75]

Cristopher Salvi, Thomas Cass, James Foster, Terry Lyons, and Weixin Yang. The signature kernel is the solution of a Goursat PDE. SIAM Journal on Mathematics of Data Science, 3(3):873–899, 2021.

[76]

Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.

[77]

Otto Szász. Über die Approximation stetiger Funktionen durch lineare Aggregate von Potenzen. Mathematische Annalen, 77(4):482–496, 1916.

[78]

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, 2017.

[79]

Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, and Hao Ma. Linformer: Self-attention with linear complexity. arXiv preprint arXiv:2006.04768, 2020.

[80]

Emanuele Zappala, Antonio Henrique de Oliveira Fonseca, Josue Ortega Caro, Andrew Henry Moberly, Michael James Higley, Jessica Cardin, and David van Dijk. Learning integral operators via neural integral equations. Nature Machine Intelligence, 6(9):1046–1062, September 2024.

Paul P. Hager, Department of Statistics and Operations Research, University of Vienna, Kol...

