Efficient Simulation and Calibration of the Rough Bergomi Model via Wasserstein Distance

arxiv: 2512.00448 · v2 · submitted 2025-11-29 · 💱 q-fin.CP

Changqing Teng, Guanglian Li

Pith reviewed 2026-05-17 03:32 UTC · model grok-4.3

classification 💱 q-fin.CP

keywords rough Bergomi model · Monte Carlo simulation · sum-of-exponentials approximation · Wasserstein distance · model calibration · volatility modeling · option pricing · non-Markovian processes

The pith

The rough Bergomi model can be simulated at linear cost in time steps and calibrated by matching entire terminal distributions with Wasserstein distance instead of discrete option prices.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an efficient way to price and calibrate the rough Bergomi model, which captures realistic volatility roughness but has been hard to use because of its non-Markovian structure. The authors introduce a modified sum-of-exponentials Monte Carlo method that treats the singular kernel exactly in the first step and approximates it with a fixed number of exponentials afterward, keeping overall cost linear in the number of time steps. This engine is then paired with a calibration procedure that matches the full distribution of the asset price at maturity using the Wasserstein-1 distance rather than minimizing squared errors on a handful of strikes. Experiments show the simulation remains accurate, especially for out-of-the-money options, while the distributional calibration recovers parameters more reliably and produces more stable fits with better out-of-sample behavior.

Core claim

The modified-sum-of-exponentials Monte Carlo scheme combines exact treatment of the singular kernel over the first time step with a sum-of-exponentials approximation thereafter and exact Gaussian simulation of the resulting multifactor components; for a fixed number of terms it maintains linear online complexity with respect to time steps and delivers high pricing accuracy, particularly for out-of-the-money options. Building on this engine, calibration is performed by matching the model-generated terminal distribution of the underlying asset to the market-implied distribution via the Wasserstein-1 distance and its Kantorovich-Rubinstein dual representation, which improves parameter recovery, optimization stability, and out-of-sample performance relative to conventional MSE-based fitting.
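To make the distributional objective concrete, here is a minimal sketch of the empirical Wasserstein-1 distance between two one-dimensional samples of equal size. It relies on the standard fact that, in one dimension, the optimal coupling pairs the sorted samples. The function name `wasserstein1` and the sample values are illustrative assumptions; the paper itself works through the Kantorovich-Rubinstein dual, which this sketch does not implement.

```python
def wasserstein1(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1D the optimal coupling pairs the i-th smallest points of each sample.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical terminal prices; the second sample is the first shifted by 0.1.
model_sample = [0.9, 1.1, 1.0, 1.3]
market_sample = [1.0, 1.2, 1.1, 1.4]
distance = wasserstein1(model_sample, market_sample)
print(distance)
```

Because the second sample is a pure shift of the first, the distance equals the size of the shift up to floating-point rounding.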

What carries the argument

The modified-sum-of-exponentials (mSOE) Monte Carlo scheme inside a hybrid multifactor approximation that handles the rough kernel exactly at the first step and with a fixed exponential sum afterward while preserving linear complexity.
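The sum-of-exponentials idea can be sketched as follows. The power-law kernel t^{H−1/2} has the integral representation t^{−α} = (1/Γ(α)) ∫_0^∞ e^{−ts} s^{α−1} ds with α = 1/2 − H, and substituting s = e^y followed by a trapezoidal rule in y gives nodes and weights for an exponential sum. This is a generic geometric-quadrature construction chosen for illustration, not the construction used in the paper; the truncation limits `y_min`, `y_max` and step `h` are assumptions. The error is measured only on [Δt, T], in line with the hybrid idea that the first step is handled exactly.

```python
import math


def soe_nodes_weights(H, y_min=-35.0, y_max=10.0, h=0.5):
    """Nodes x_k and weights w_k with t**(H - 0.5) ~ sum_k w_k * exp(-x_k * t)."""
    alpha = 0.5 - H
    n = int(round((y_max - y_min) / h)) + 1
    ys = [y_min + k * h for k in range(n)]
    nodes = [math.exp(y) for y in ys]
    # Trapezoidal rule in y applied to exp(-t * e^y) * e^(alpha * y) / Gamma(alpha).
    weights = [h * math.exp(alpha * y) / math.gamma(alpha) for y in ys]
    return nodes, weights


def soe_kernel(t, nodes, weights):
    """Evaluate the exponential-sum approximation at time t."""
    return sum(w * math.exp(-x * t) for x, w in zip(nodes, weights))


H, dt, T = 0.1, 1.0 / 128, 1.0  # illustrative values, not the paper's settings
nodes, weights = soe_nodes_weights(H)
grid = [dt + k * (T - dt) / 199 for k in range(200)]
max_rel_err = max(
    abs(soe_kernel(t, nodes, weights) / t ** (H - 0.5) - 1.0) for t in grid
)
print(len(nodes), max_rel_err)
```

This naive quadrature needs far more terms than a tuned approximation would; it is meant only to show why a kernel of this type admits an exponential-sum representation away from the origin.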

If this is right

Simulation cost scales linearly with the number of time steps once the number of exponential terms is fixed.
Pricing errors remain small for out-of-the-money options even with the hybrid approximation.
Calibration optimization becomes more stable because the objective compares full terminal distributions rather than isolated strikes.
Out-of-sample pricing performance improves when parameters are obtained from Wasserstein matching.
The same pricing engine can be reused for both calibration and subsequent valuation without changing complexity.
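The linear-cost claim in the list above can be illustrated with a small sketch. Once the kernel is a sum of exponentials, each term carries its own state that is updated once per time step, so the history never has to be revisited. The code compares that recursion with a direct O(n²) convolution on a simple left-point discretization. This is a simplification: the paper uses exact Gaussian simulation of the factors, which is not reproduced here, and the nodes and weights below are hypothetical.

```python
import math
import random


def convolve_direct(dW, dt, nodes, weights):
    """O(n^2) reference: sum_j K(t_i - t_j) * dW_j with K(t) = sum_k w_k * exp(-x_k * t)."""
    out = [0.0]
    for i in range(1, len(dW) + 1):
        total = 0.0
        for j in range(i):
            lag = (i - j) * dt
            kernel = sum(w * math.exp(-x * lag) for x, w in zip(nodes, weights))
            total += kernel * dW[j]
        out.append(total)
    return out


def convolve_recursive(dW, dt, nodes, weights):
    """O(n * N): one state per exponential term, updated once per time step."""
    decay = [math.exp(-x * dt) for x in nodes]
    state = [0.0] * len(nodes)
    out = [0.0]
    for inc in dW:
        # Add the new increment to each factor, then decay it over one step.
        state = [d * (s + inc) for d, s in zip(decay, state)]
        out.append(sum(w * s for w, s in zip(weights, state)))
    return out


random.seed(0)
n, dt = 64, 1.0 / 64
nodes, weights = [0.5, 5.0, 50.0], [1.0, 0.7, 0.3]  # hypothetical SOE terms
dW = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
direct = convolve_direct(dW, dt, nodes, weights)
recursive = convolve_recursive(dW, dt, nodes, weights)
max_gap = max(abs(u - v) for u, v in zip(direct, recursive))
print(max_gap)
```

The two outputs agree up to rounding, while the recursive version does a fixed amount of work per step for a fixed number of terms.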

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The linear-complexity engine could support real-time risk calculations that were previously too slow under rough-volatility dynamics.
Distributional matching may automatically capture higher-order moments and tail behavior that strike-by-strike fitting often misses.
The approach might extend to other rough-volatility models whose kernels admit similar exponential approximations.
Testing the method on intraday or high-frequency data could reveal whether the fixed-exponential assumption holds under shorter time horizons.

Load-bearing premise

A fixed number of exponential terms in the mSOE approximation remains accurate enough for the singular kernel across the parameter regimes and time horizons used in the pricing and calibration experiments.

What would settle it

Generate paths with the true rough Bergomi parameters, apply the Wasserstein calibration on simulated option data, and check whether the recovered parameters converge to the known true values as the number of Monte Carlo paths increases.
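The shape of such a recovery test can be sketched on a toy problem. The code below swaps in a one-parameter lognormal model for the rough Bergomi simulator (an assumption made purely to keep the example short), draws a synthetic "market" sample at a known volatility, and recovers that volatility by a grid search over the empirical Wasserstein-1 loss. All names and values are hypothetical.

```python
import math
import random


def terminal_prices(sigma, normals, T=1.0):
    """Lognormal stand-in for a model simulator: S_T = exp(sigma*sqrt(T)*Z - sigma^2*T/2)."""
    return [math.exp(sigma * math.sqrt(T) * z - 0.5 * sigma * sigma * T) for z in normals]


def wasserstein1(xs, ys):
    """Empirical Wasserstein-1 distance for equal-size 1D samples."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


rng = random.Random(1)
n_paths, true_sigma = 20000, 0.20
market = terminal_prices(true_sigma, [rng.gauss(0.0, 1.0) for _ in range(n_paths)])
# Common random numbers: reuse one set of model draws for every candidate value.
model_normals = [rng.gauss(0.0, 1.0) for _ in range(n_paths)]

candidates = [0.10 + 0.01 * k for k in range(31)]
losses = [wasserstein1(terminal_prices(s, model_normals), market) for s in candidates]
best_sigma = candidates[losses.index(min(losses))]
print(best_sigma)
```

In a real test the simulator would be the rBergomi engine, the search would run over all model parameters, and the experiment would be repeated for increasing path counts to see whether the recovered values settle on the true ones.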

Figures

Figures reproduced from arXiv: 2512.00448 by Changqing Teng, Guanglian Li.

**Figure 1.** Implied volatility smiles generated by the Cholesky factorization, SOE, and mSOE schemes for a fixed number of time steps n = 128. The SOE scheme exhibits minor deviations from the benchmark for at-the-money strikes and significant inaccuracies for out-of-the-money strikes. In contrast, the mSOE scheme effectively mitigates these discrepancies. When N = 16, the mSOE smile becomes visually indi…

**Figure 2.** Maximal relative errors of implied volatility smiles using …

**Figure 3.** Weak convergence rate of maximal relative errors of implied volatility smiles for …

**Figure 4.** Maximal relative errors of implied volatility surfaces using …

**Figure 5.** Weak convergence rate of maximal relative errors of implied volatility surface for …

**Figure 6.** From top to bottom: the price of DOP and UOC options based on different model …

**Figure 7.** From left to right: the log density plot of future states at …

**Figure 8.** Contour plot of Wasserstein-1 distance.

**Figure 9.** Contour plot of MSE.

**Figure 10.** The calibrated ξ_0(t) using different parameterizations.

Original abstract

Despite the empirical success of the rough Bergomi (rBergomi) model in modeling volatility dynamics, its practical use remains challenging due to high computational complexity in both pricing and calibration arising from its non-Markovian structure. To address these difficulties, we develop an efficient computational framework. First, we propose a modified-sum-of-exponentials (mSOE) Monte Carlo scheme within the class of hybrid multifactor approximations. The method combines an exact treatment of the singular kernel over the first time step with a sum-of-exponentials approximation over the remaining time interval, and exact Gaussian simulation of the resulting multifactor components. For a fixed number of exponential terms, the method maintains linear online complexity with respect to the number of time steps. It achieves high pricing accuracy in numerical experiments, particularly for out-of-the-money options. Second, building on this pricing engine, we formulate a calibration approach based on distributional matching of the terminal underlying asset via the Wasserstein-1 distance. Instead of fitting option prices only at selected strikes, this method compares model-generated and market-implied terminal distributions through the Kantorovich-Rubinstein dual representation. Numerical experiments indicate that the mSOE scheme exhibits stable convergence, and the Wasserstein-based calibration scheme improves parameter recovery, optimization stability, and out-of-sample performance relative to conventional MSE-based fitting in the rBergomi setting considered in this paper.
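The Kantorovich-Rubinstein dual mentioned in the abstract states that W1(μ, ν) is the supremum of E_μ[f] − E_ν[f] over all 1-Lipschitz functions f. The sketch below checks this numerically on a toy case: every 1-Lipschitz candidate gives a lower bound on the primal value, and for a pure shift the identity function attains it. The samples and the candidate functions are illustrative assumptions; the paper's actual parameterization of f is not described in this summary.

```python
import math
import random


def wasserstein1(xs, ys):
    """Primal side: sorted-sample formula for equal-size 1D samples."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def dual_value(f, xs, ys):
    """Dual side: E_mu[f] - E_nu[f] for one candidate test function f."""
    return sum(map(f, xs)) / len(xs) - sum(map(f, ys)) / len(ys)


rng = random.Random(7)
xs = [math.exp(0.2 * rng.gauss(0.0, 1.0)) for _ in range(2000)]  # hypothetical model sample
ys = [x - 0.3 for x in xs]  # shifted copy standing in for the market sample

# Each candidate is 1-Lipschitz, so each dual value is a lower bound on W1.
candidates = {
    "x": lambda x: x,
    "-x": lambda x: -x,
    "|x - 1|": lambda x: abs(x - 1.0),
    "sin x": math.sin,
}
w1 = wasserstein1(xs, ys)
bounds = {name: dual_value(f, xs, ys) for name, f in candidates.items()}
print(w1, bounds)
```

For distributions that differ by more than a shift, no single simple candidate is optimal, which is why dual formulations usually search over a flexible family of Lipschitz-constrained functions.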

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The mSOE hybrid scheme plus Wasserstein calibration gives a workable speed-up for rBergomi pricing and fitting, but the fixed-term approximation still needs tighter error control for rougher regimes.

The paper's core move is a hybrid Monte Carlo scheme that handles the singular kernel exactly on the first step and then switches to a fixed sum-of-exponentials approximation for the rest, keeping online cost linear in the number of steps. They pair this engine with a Wasserstein-1 calibration that matches the full terminal distribution of the asset rather than a handful of option prices. Both pieces are new in this combination for the rBergomi setting and address a real bottleneck that has kept the model from wider routine use.
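One common way to obtain a terminal distribution from option quotes is the Breeden-Litzenberger relation, under which, with zero interest rates, P(S_T ≤ K) = 1 + ∂C/∂K for call price C(K). Whether the paper builds its market-implied distribution this way is not stated in this summary, so the sketch below is an assumption offered only to show how a strip of call prices carries distributional information. Black-Scholes prices are used to fabricate a smooth test curve, and all parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d2(S0, K, sigma, T):
    return (math.log(S0 / K) - 0.5 * sigma * sigma * T) / (sigma * math.sqrt(T))


def bs_call(S0, K, sigma, T):
    """Black-Scholes call with zero rates, used only to fabricate a smooth price curve."""
    lower = d2(S0, K, sigma, T)
    upper = lower + sigma * math.sqrt(T)
    return S0 * norm_cdf(upper) - K * norm_cdf(lower)


def implied_cdf(strikes, calls):
    """Central-difference estimate of P(S_T <= K) = 1 + dC/dK at interior strikes."""
    out = []
    for i in range(1, len(strikes) - 1):
        slope = (calls[i + 1] - calls[i - 1]) / (strikes[i + 1] - strikes[i - 1])
        out.append((strikes[i], 1.0 + slope))
    return out


S0, sigma, T = 1.0, 0.2, 1.0
strikes = [0.5 + 0.01 * k for k in range(101)]
calls = [bs_call(S0, K, sigma, T) for K in strikes]
cdf = implied_cdf(strikes, calls)
# Under Black-Scholes the exact CDF is N(-d2), which gives a reference to compare with.
max_err = max(abs(p - norm_cdf(-d2(S0, K, sigma, T))) for K, p in cdf)
print(max_err)
```

With real quotes the strike grid is sparse and noisy, so the price curve typically has to be smoothed or interpolated before differentiating.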

Referee Report

2 major / 2 minor

Summary. The paper proposes a modified sum-of-exponentials (mSOE) Monte Carlo scheme for the rough Bergomi model that treats the singular kernel exactly on the first time step and approximates the remainder with a fixed number of exponentials, yielding linear online complexity and high accuracy for out-of-the-money options in reported experiments; it further introduces a Wasserstein-1 distance calibration that matches terminal distributions via the Kantorovich-Rubinstein dual and reports improved parameter recovery, optimization stability, and out-of-sample performance relative to MSE-based fitting.

Significance. If the mSOE approximation error remains controlled, the framework would supply a practical, linearly scaling simulation engine and a more stable distributional calibration method for rough volatility models that are widely used in quantitative finance. The numerical experiments demonstrating stable convergence of the pricing scheme and better out-of-sample behavior of the Wasserstein objective constitute a concrete strength that supports the practical utility of the approach.

major comments (2)

§3 (mSOE Monte Carlo scheme): the central accuracy claim for OTM options and the downstream Wasserstein calibration both rest on the fixed-term sum-of-exponentials approximation controlling truncation error for the kernel K(t)∼t^{H−1/2}. No a priori error bound is supplied, and the experiments do not systematically vary H down to 0.05 or extend T while holding the number of exponential terms constant; this is load-bearing for the reported pricing accuracy and calibration improvements.
§5 (numerical experiments): the reported high accuracy and stable convergence are shown only for the chosen parameter regimes; without an ablation that increases the number of time steps or decreases H while keeping the exponential term count fixed, it is unclear whether the linear-complexity scheme continues to deliver accuracy within Monte Carlo tolerance outside the tested window.

minor comments (2)

Abstract: the phrase 'the rBergomi setting considered in this paper' is vague; a brief statement of the H and T ranges actually tested would improve clarity.
Notation: the distinction between the exact first-step kernel treatment and the subsequent mSOE approximation could be highlighted with a dedicated equation or diagram to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments highlight important aspects of the theoretical and numerical support for the mSOE scheme. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

Referee: §3 (mSOE Monte Carlo scheme): the central accuracy claim for OTM options and the downstream Wasserstein calibration both rest on the fixed-term sum-of-exponentials approximation controlling truncation error for the kernel K(t)∼t^{H−1/2}. No a priori error bound is supplied, and the experiments do not systematically vary H down to 0.05 or extend T while holding the number of exponential terms constant; this is load-bearing for the reported pricing accuracy and calibration improvements.

Authors: We agree that a rigorous a priori error bound for the truncation error of the fixed-term sum-of-exponentials approximation would strengthen the theoretical foundation. The current manuscript relies on the hybrid construction (exact treatment of the singular kernel on the first step) together with numerical evidence to control the error for the reported regimes. To address the referee's concern, we will add a dedicated subsection in §3 discussing the sources of approximation error and their dependence on H and the number of terms, and we will include additional numerical tests that systematically lower H to 0.05 and increase T while keeping the exponential count fixed.

Revision: yes

Referee: §5 (numerical experiments): the reported high accuracy and stable convergence are shown only for the chosen parameter regimes; without an ablation that increases the number of time steps or decreases H while keeping the exponential term count fixed, it is unclear whether the linear-complexity scheme continues to deliver accuracy within Monte Carlo tolerance outside the tested window.

Authors: The experiments in §5 demonstrate stable convergence and high accuracy for the parameter values and time-step counts typical in practical rBergomi applications. We acknowledge that broader ablation studies would more convincingly establish robustness of the linear-complexity claim. In the revised manuscript we will augment §5 with additional tables and figures that increase the number of time steps and decrease H (down to 0.05) while holding the number of exponential terms constant, confirming that accuracy stays within Monte Carlo tolerance.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained

The mSOE Monte Carlo scheme is constructed by explicitly combining an exact singular-kernel treatment on the first time step with a fixed-term sum-of-exponentials approximation thereafter, followed by exact Gaussian simulation of the resulting factors; this definition is independent of any target pricing or calibration outcome. The Wasserstein-1 calibration is likewise formulated directly via the Kantorovich-Rubinstein dual representation comparing terminal distributions, without fitting a parameter that is then re-labeled as a prediction or invoking a self-citation chain whose validity depends on the present results. Numerical accuracy claims rest on reported experiments rather than on any algebraic identity that collapses the method to its inputs by construction. No load-bearing self-citation, uniqueness theorem imported from the authors, or ansatz smuggled via prior work appears in the central derivation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on the validity of the sum-of-exponentials kernel approximation and on the assumption that matching terminal distributions via Wasserstein distance yields better model parameters than price fitting; no new physical entities are introduced.

free parameters (1)

number of exponential terms
Fixed by the user to trade accuracy against speed; directly controls the approximation quality of the singular kernel.

axioms (1)

Domain assumption: the rough Bergomi model dynamics with fractional kernel are correctly specified for the market data considered.
Invoked throughout the pricing and calibration sections as the target process.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: inf_{θ∈Θ} (1/M) ∑_j W_1(S_{T_j}(θ), S^{MKT}_{T_j}) ... Kantorovich-Rubinstein duality

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
