Signature transforms approximate path-dependent nonlinear rewards as linear functionals, enabling the DisSigUCB algorithm with a high-probability regret bound of order O(sqrt((d+m)KT)).
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background (2)
citation-polarity summary
fields: cs.LG (2)
years: 2026 (2)
roles: background (2)
polarities: background (2)
representative citing papers
ARL lifts states into signature-augmented manifolds and employs self-consistent proxies of future path-laws to enable deterministic expected-return evaluation while preserving contraction mappings in jump-diffusion environments.
citing papers explorer
- Signature Approach for Contextual Bandits with Nonlinear and Path-dependent Rewards
  Signature transforms approximate path-dependent nonlinear rewards as linear functionals, enabling the DisSigUCB algorithm with a high-probability regret bound of order O(sqrt((d+m)KT)).
- Anticipatory Reinforcement Learning: From Generative Path-Laws to Distributional Value Functions
  ARL lifts states into signature-augmented manifolds and employs self-consistent proxies of future path-laws to enable deterministic expected-return evaluation while preserving contraction mappings in jump-diffusion environments.
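
The first summary above describes nonlinear, path-dependent rewards becoming linear functionals of the signature. A minimal sketch of that idea, assuming a piecewise-linear path and truncation at depth 2, is given below. It is not the DisSigUCB algorithm or any code from the cited papers. The helper name `signature_depth2` and the sample path are hypothetical, chosen only for illustration. The example checks the shuffle identity: the product of the two total increments, a nonlinear function of the path, equals a linear combination of second-level signature terms.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    `path` is an (n_points, d) array. Returns (level1, level2), where
    level1 has shape (d,) and level2 has shape (d, d).
    Hypothetical helper for illustration only.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one straight segment:
        # the new level-2 term uses level1 *before* it is updated.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2


# Illustrative 2-dimensional path (assumed data, not from the source).
path = np.array([[0.0, 0.0], [1.0, 0.5], [1.5, 2.0], [3.0, 1.0]])
level1, level2 = signature_depth2(path)

# A nonlinear function of the path: product of the total increments.
total_increment = path[-1] - path[0]
nonlinear_reward = total_increment[0] * total_increment[1]

# The same quantity as a linear functional of the signature
# (shuffle identity: S^(1,2) + S^(2,1) = S^(1) * S^(2)).
linear_readout = level2[0, 1] + level2[1, 0]

# The Levy area depends on the whole path, not only its endpoints,
# and is also linear in the level-2 terms.
levy_area = 0.5 * (level2[0, 1] - level2[1, 0])
```

In a bandit setting, the flattened signature terms would play the role of the feature vector in a linear reward model. How the cited paper truncates, normalizes, or combines them with context features is not stated here.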