Differential Machine Learning for 0DTE Options with Stochastic Volatility and Jumps
Pith reviewed 2026-05-15 14:55 UTC · model grok-4.3
The pith
A three-stage differential machine learning procedure approximates jump terms more accurately for ultra-short-maturity options while preserving pricing accuracy and delivering faster Greeks than Fourier methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Expressing the option price in Black-Scholes form with a maturity-gated variance correction, supervising both prices and Greeks from a single pricing network, and fitting a jump-operator network jointly in a three-stage procedure with a PIDE-residual penalty improves the identifiability and accuracy of the jump term at ultra-short maturities relative to one-stage training. At the same time, the method maintains comparable pricing errors, reduces Greeks errors, yields stable one-day delta hedges, and provides large speedups over Fourier methods.
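One hedged way to write the ansatz and objective this claim describes (our notation; the paper's exact gate, correction network, and loss weights are not reproduced here and may differ):

```latex
V_\theta(S, v, \tau) = \mathrm{BS}\!\bigl(S, K, r, \bar{\sigma}_\theta(S, v, \tau)\bigr),
\qquad
\bar{\sigma}_\theta^{2}(S, v, \tau) = v + g(\tau)\, c_\theta(S, v, \tau),
```

where g is a smooth maturity gate controlling the learned correction c_theta as tau approaches zero, and training minimizes a composite loss of the form

```latex
\mathcal{L}(\theta, \phi) =
\bigl\lVert V_\theta - V^{\mathrm{ref}} \bigr\rVert^{2}
+ \lambda_G \bigl\lVert \nabla V_\theta - \nabla V^{\mathrm{ref}} \bigr\rVert^{2}
+ \lambda_R \bigl\lVert \mathcal{R}[V_\theta, J_\phi] \bigr\rVert^{2},
```

with V^ref the reference prices, J_phi the jump-operator network, and R the PIDE residual.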
What carries the argument
Three-stage joint training of a pricing network and a jump-operator network together with a maturity-gated variance correction inside a Black-Scholes representation and a PIDE-residual penalty term.
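The staging can be sketched on a toy problem. In the sketch below, linear-in-feature models stand in for the two networks and a second-derivative operator stands in for the non-local PIDE term; the staging order and loss weights are our assumptions, not the paper's.

```python
import numpy as np

# Toy schematic of the three-stage procedure. Everything is linear so the
# stage-1 and stage-2 fits are exact least squares; the "PIDE residual" is
# stood in for by a linear ODE residual u'' + J = g, with u the pricing
# model and J the jump-operator model.

s = np.linspace(0.0, 1.0, 64)

def phi(s):    # shared features for both models
    return np.stack([np.ones_like(s), s, s**2, s**3], axis=-1)

def dphi(s):   # d(phi)/ds, used for Greek supervision
    return np.stack([np.zeros_like(s), np.ones_like(s), 2 * s, 3 * s**2], axis=-1)

def d2phi(s):  # d2(phi)/ds2, the stand-in differential operator
    return np.stack([np.zeros_like(s), np.zeros_like(s),
                     2 * np.ones_like(s), 6 * s], axis=-1)

w_true = np.array([0.2, -1.0, 0.5, 0.3])    # "pricing" target weights
v_true = np.array([0.1, 0.4, -0.2, 0.0])    # "jump term" target weights
price, greek = phi(s) @ w_true, dphi(s) @ w_true
g = d2phi(s) @ w_true + phi(s) @ v_true     # dynamics: u'' + J = g

# Stage 1: fit the pricing model on prices AND Greeks (differential supervision).
A = np.vstack([phi(s), dphi(s)])
w, *_ = np.linalg.lstsq(A, np.concatenate([price, greek]), rcond=None)

# Stage 2: freeze w, fit the jump model to the implied jump term.
v, *_ = np.linalg.lstsq(phi(s), g - d2phi(s) @ w, rcond=None)

# Stage 3: joint fine-tuning with the residual penalty added to the loss.
lam, lr = 1.0, 0.02
for _ in range(200):
    r_p = phi(s) @ w - price               # price residual
    r_g = dphi(s) @ w - greek              # Greek residual
    r_q = d2phi(s) @ w + phi(s) @ v - g    # stand-in "PIDE" residual
    gw = (phi(s).T @ r_p + dphi(s).T @ r_g + lam * d2phi(s).T @ r_q) / len(s)
    gv = lam * phi(s).T @ r_q / len(s)
    w, v = w - lr * gw, v - lr * gv

residual = float(np.abs(d2phi(s) @ w + phi(s) @ v - g).max())
```

With exact data the staged fits already land on the true weights, so stage 3 only keeps the residual at numerical zero; on noisy data, the residual penalty is what disambiguates the pricing weights from the jump weights.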
If this is right
- Better jump-term recovery allows more reliable decomposition of price changes into diffusive and jump components at very short horizons.
- Lower Greeks errors translate directly into smaller hedging residual variance for daily rebalancing of 0DTE positions.
- Speedups over Fourier inversion make repeated calibration and real-time risk calculations feasible inside stochastic-volatility jump models.
- Inclusion of jump-intensity price sensitivity in the loss further tightens the calibrated parameter fit.
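The Greeks-to-hedging link in the second point can be checked in a toy setting. The sketch below hedges a one-day ATM call once at inception under plain Black-Scholes (no jumps; a deliberate simplification of the paper's SVJD setting) and compares the P&L variance of the exact delta against a deliberately biased one.

```python
import math
import numpy as np

# Toy check that delta accuracy maps to hedging residual variance: short one
# one-day call, hedge once with `delta` shares, mark at expiry, and compare
# variances across many simulated paths.

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price and delta."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2), N(d1)

S0, K, r, sigma, T = 100.0, 100.0, 0.0, 0.2, 1.0 / 252.0
C0, delta0 = bs_call(S0, K, r, sigma, T)

rng = np.random.default_rng(1)
Z = rng.standard_normal(20000)
S1 = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * Z)
payoff = np.maximum(S1 - K, 0.0)

def hedge_pnl(delta):
    # short call hedged with `delta` shares, held over the one-day horizon
    return payoff - C0 - delta * (S1 - S0)

var_exact = float(hedge_pnl(delta0).var())
var_biased = float(hedge_pnl(delta0 + 0.2).var())   # a 0.2 error in delta
```

The biased hedge adds variance roughly proportional to the squared delta error times the variance of the underlying move, which is the mechanism behind the claim.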
Where Pith is reading between the lines
- The same staged-training pattern could be applied to other path-dependent contracts whose pricing equations contain non-local integral terms.
- If the networks generalize beyond the training measure, they could serve as fast surrogates inside Monte-Carlo engines that simulate many short-maturity paths.
- Empirical tests on actual 0DTE market quotes would show whether the reported stability persists when the true jump distribution differs from the training assumption.
Load-bearing premise
The three-stage joint training of pricing and jump-operator networks together with the maturity-gated variance correction reliably separates jump contributions and yields accurate Greeks for ultra-short maturities without overfitting to the chosen training paths or model parameters.
What would settle it
On held-out 0DTE paths with large jump intensity, the learned jump-operator network produces approximation errors comparable to or larger than a one-stage baseline, or the resulting delta hedges become unstable over a one-day horizon.
Original abstract
We present a differential machine learning method for zero-days-to-expiry (0DTE) options under a stochastic-volatility jump-diffusion model. To handle the ultra-short-maturity regime, we express the option price in Black-Scholes form with a maturity-gated variance correction, combining supervision on prices and Greeks with a PIDE-residual penalty. Prices and Greeks are derived from a single trained pricing network, while jump-term identifiability is ensured by a jump-operator network fitted jointly in a three-stage procedure. The method improves jump-term approximation relative to one-stage baselines while maintaining comparable pricing errors. Furthermore, it reduces errors in Greeks, produces stable one-day delta hedges, and offers significant speedups over Fourier-based benchmarks. Calibration experiments demonstrate the network's efficiency as a pricer; notably, incorporating jump-intensity price sensitivity into the learning process further improves the overall model fit.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a differential machine learning method for pricing 0DTE options under a stochastic-volatility jump-diffusion model. It expresses the option price in Black-Scholes form with a maturity-gated variance correction, combines supervision on prices and Greeks with a PIDE-residual penalty, derives prices and Greeks from a single pricing network, and uses a three-stage joint training procedure with a separate jump-operator network to ensure identifiability of jump terms. The central claims are improved jump-term approximation relative to one-stage baselines, reduced errors in Greeks, stable one-day delta hedges, significant speedups over Fourier benchmarks, and better calibration fit when incorporating jump-intensity sensitivities.
Significance. If the quantitative claims hold under independent verification, the work offers a computationally efficient approach to pricing and hedging ultra-short-maturity options with jumps, which is relevant for high-frequency trading and real-time risk management. The three-stage training procedure for separating jump-operator effects and the use of PIDE residuals within a differential ML framework constitute a targeted methodological contribution to computational finance for regimes where standard Fourier methods become expensive.
major comments (3)
- [Abstract and Experiments] The claims of improved jump-term approximation and reduced Greeks errors are stated without accompanying quantitative error tables, ablation studies comparing the three-stage procedure to one-stage baselines, or explicit verification that the PIDE residual enforces the model dynamics rather than being absorbed into the network fit.
- [§3.2] The identifiability of jump terms is asserted to follow from the joint training of the pricing and jump-operator networks together with the maturity-gated variance correction, yet no diagnostic tests (e.g., sensitivity to the training-data distribution or recovery of known jump parameters on synthetic data) are reported to confirm that the procedure does not overfit to the specific model assumptions.
- [Calibration experiments] While speedups over Fourier methods are claimed, the manuscript does not report precise computational timings, the number of calibration iterations, or out-of-sample pricing errors on held-out strikes and maturities that would substantiate the efficiency advantage for practical use.
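The synthetic-recovery diagnostic requested in the second comment can be prototyped cheaply. The sketch below simulates daily increments from a Merton-style jump diffusion with known intensity and recovers the intensity by threshold counting; this scalar estimator is only a stand-in for inspecting the paper's jump-operator network, and it is biased low because jumps smaller than the threshold are missed.

```python
import numpy as np

# Synthetic-recovery diagnostic sketch: simulate log-return increments with a
# known jump intensity lam_true, then estimate the intensity by counting
# increments beyond a diffusive threshold. A recovery test for the actual
# method would compare the learned jump operator against the known integral
# term on the same synthetic paths.

rng = np.random.default_rng(2)
n_days, dt = 252 * 40, 1.0 / 252.0
sigma, lam_true = 0.15, 20.0          # diffusive vol, jumps per year
jump_mu, jump_sd = 0.0, 0.15          # jump-size distribution

diffusive = sigma * np.sqrt(dt) * rng.standard_normal(n_days)
n_jumps = rng.poisson(lam_true * dt, n_days)
jump_part = jump_mu * n_jumps + jump_sd * np.sqrt(n_jumps) * rng.standard_normal(n_days)
increments = diffusive + jump_part

# Count increments beyond 3 diffusive standard deviations as jump days.
thresh = 3.0 * sigma * np.sqrt(dt)
lam_hat = float((np.abs(increments) > thresh).mean() / dt)
```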
minor comments (2)
- [§2] Notation for the maturity-gated variance correction and the jump-operator network output should be introduced with explicit equations rather than descriptive prose to improve reproducibility.
- [Abstract] The abstract states that incorporating jump-intensity price sensitivity improves model fit, but the corresponding quantitative improvement (e.g., reduction in calibration RMSE) is not shown in any table or figure.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the quantitative support and validation of our claims.
Point-by-point responses
Referee: [Abstract and Experiments] The claims of improved jump-term approximation and reduced Greeks errors are stated without accompanying quantitative error tables, ablation studies comparing the three-stage procedure to one-stage baselines, or explicit verification that the PIDE residual enforces the model dynamics rather than being absorbed into the network fit.
Authors: We agree that the claims would be more robust with explicit quantitative backing. In the revised manuscript we will add error tables quantifying jump-term and Greeks improvements, ablation studies isolating the three-stage procedure versus one-stage baselines, and verification that PIDE residuals remain small and consistent with the model dynamics (e.g., via residual norm statistics on held-out paths). revision: yes
Referee: [§3.2] The identifiability of jump terms is asserted to follow from the joint training of the pricing and jump-operator networks together with the maturity-gated variance correction, yet no diagnostic tests (e.g., sensitivity to the training-data distribution or recovery of known jump parameters on synthetic data) are reported to confirm that the procedure does not overfit to the specific model assumptions.
Authors: We acknowledge that diagnostic evidence would strengthen the identifiability argument. We will add synthetic-data recovery experiments (recovering known jump parameters) and sensitivity tests to training-data distributions in the revised §3.2 to demonstrate that the three-stage procedure does not overfit to the assumed model. revision: yes
Referee: [Calibration experiments] While speedups over Fourier methods are claimed, the manuscript does not report precise computational timings, the number of calibration iterations, or out-of-sample pricing errors on held-out strikes and maturities that would substantiate the efficiency advantage for practical use.
Authors: We will include the requested details in the revised calibration section: wall-clock timings for the network versus Fourier benchmarks, the exact number of calibration iterations, and out-of-sample pricing errors on held-out strikes and maturities to substantiate the practical efficiency advantage. revision: yes
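To make the requested timing comparison concrete, the Fourier benchmark can be pinned down as a Gil-Pelaez quadrature. The sketch below specializes it to Black-Scholes so the result can be verified against the closed form; a Bates/SVJD characteristic function would slot into `cf` unchanged, and the per-price quadrature is the recurring cost a trained pricing network amortizes away.

```python
import math
import numpy as np

# Minimal Gil-Pelaez Fourier call pricer, checked against closed-form
# Black-Scholes. Wall-clock comparisons of the kind the referee asks for would
# time fourier_call (per price) against a batched network forward pass.

S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 0.1

def cf(u):
    """Characteristic function of ln S_T under Black-Scholes."""
    mean = math.log(S0) + (r - 0.5 * sigma**2) * T
    return np.exp(1j * u * mean - 0.5 * sigma**2 * T * u**2)

def trap(y, u):
    """Trapezoid rule on a uniform grid."""
    return (u[1] - u[0]) * (y.sum() - 0.5 * (y[0] + y[-1]))

def fourier_call():
    u = np.linspace(1e-6, 200.0, 4001)
    k = math.log(K)
    # Pi2 = Q(S_T > K); Pi1 uses the share-measure twist cf(u - 1j) / cf(-1j).
    pi2 = 0.5 + trap(np.real(np.exp(-1j * u * k) * cf(u) / (1j * u)), u) / math.pi
    pi1 = 0.5 + trap(np.real(np.exp(-1j * u * k) * cf(u - 1j)
                             / (1j * u * cf(-1j))), u) / math.pi
    return S0 * pi1 - K * math.exp(-r * T) * pi2

def bs_call():
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S0 * N(d1) - K * math.exp(-r * T) * N(d2)

price_fourier, price_closed = fourier_call(), bs_call()
```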
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper defines a three-stage training procedure for a pricing network (expressing price via Black-Scholes form plus maturity-gated variance correction) and a separate jump-operator network, with loss terms that include direct supervision on prices/Greeks plus a PIDE residual penalty. These residuals derive from the known SVJ dynamics rather than from the network outputs themselves, and performance is measured against independent Fourier benchmarks on generated data. No step reduces a claimed prediction to a fitted input by construction, no load-bearing self-citation appears, and the ansatz is explicitly introduced as part of the method rather than imported. The central claims (improved jump-term recovery and Greeks relative to one-stage baselines) therefore rest on external numerical validation rather than tautological reduction.
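For concreteness, a representative PIDE for a Bates-type SVJ model, whose residual could serve as the penalty described above (our transcription of the standard form; the paper's exact dynamics may differ):

```latex
\frac{\partial V}{\partial t}
+ (r - \lambda \kappa) S \frac{\partial V}{\partial S}
+ \kappa_v (\theta - v) \frac{\partial V}{\partial v}
+ \frac{1}{2} v S^2 \frac{\partial^2 V}{\partial S^2}
+ \rho \sigma_v v S \frac{\partial^2 V}{\partial S\,\partial v}
+ \frac{1}{2} \sigma_v^2 v \frac{\partial^2 V}{\partial v^2}
- r V
+ \lambda \int_0^\infty \bigl[ V(S y, v, t) - V(S, v, t) \bigr] f(y)\, dy = 0,
\qquad \kappa = \mathbb{E}[y - 1],
```

with f the jump-size density. The non-local integral is the term the jump-operator network approximates, and the left-hand side evaluated on the network outputs is the residual being penalized.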
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network weights and hyperparameters
axioms (2)
- domain assumption: Black-Scholes form with a maturity-gated variance correction accurately represents the 0DTE price under the SVJD model
- ad hoc to paper: three-stage training ensures identifiability of jump terms