pith. sign in

arxiv: 2509.19707 · v2 · pith:HZVQBHIEnew · submitted 2025-09-24 · 📊 stat.ML · cs.LG· stat.CO· stat.ME

Diffusion and Flow-based Copulas: Forgetting and Remembering Dependencies

Pith reviewed 2026-05-21 21:47 UTC · model grok-4.3

classification 📊 stat.ML cs.LGstat.COstat.ME
keywords copulasdiffusion modelsnormalizing flowsmultivariate dependencedensity estimationsamplinghigh-dimensional data
0
0 comments X

The pith

Copula models recover true multivariate dependencies by learning to reverse diffusion and flow processes that forget them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to build flexible copula models for complex, high-dimensional dependencies by first creating diffusion and flow processes that erase interactions between variables step by step while keeping each variable's own distribution fixed. These processes produce valid copulas at every stage. Models are then trained to remember and restore the erased dependencies, with the result that the learned model recovers the original copula when training succeeds. One version targets direct density estimation and the other focuses on fast sampling, both demonstrating better performance than prior copula methods on scientific data and images.

Core claim

We design two processes that progressively forget inter-variable dependencies while leaving dimension-wise distributions unaffected, provably defining valid copulas at all times. We show how to obtain copula models by learning to remember the forgotten dependencies from each process, theoretically recovering the true copula at optimality.

What carries the argument

The pair of diffusion and flow forgetting processes that progressively remove inter-variable dependencies without altering marginal distributions, reversed by a learned remembering mechanism that recovers the joint dependence structure.

If this is right

  • Superior empirical performance on high-dimensional and multimodal dependencies from scientific datasets and images.
  • One instantiation supports direct density estimation while the other enables efficient sampling.
  • Theoretical recovery of the true copula when the remembering process reaches optimality.
  • Increased representational power that supports scaling copula models to larger and more challenging domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The forgetting-remembering structure could be adapted to other generative models that need to isolate and then restore specific dependence patterns.
  • Testing on even higher-dimensional problems where traditional copulas fail would clarify the practical limits of the recovery guarantee.
  • The approach might combine with existing density estimators to handle mixed continuous-discrete data without custom copula constructions.

Load-bearing premise

The learned remembering process reaches the true copula without the training objective or optimization introducing persistent biases.

What would settle it

On a synthetic dataset drawn from a known complex copula, train the remembering model to convergence and check whether the generated joint distribution matches the true copula within sampling error.

Figures

Figures reproduced from arXiv: 2509.19707 by David Huk, Theodoros Damoulas.

Figure 1
Figure 1. Figure 1: Overview of proposed copula models. We design forward processes to forget inter￾variable dependencies but preserve dimension-wise marginals. Our classification-diffusion copula and reflection copula learn by remembering the forgotten dependencies of these processes. interpretable way, facilitating the study of neuron spikes in the brain (Berkes et al., 2008; Verzelli & Sacerdote, 2019), increasing the flex… view at source ↗
Figure 2
Figure 2. Figure 2: Classifier-Diffusion Copula. We map copula data u to Gaussian scale data z, to which we apply an Ornstein-Uhlenbeck process up to a time t. We train a multinomial classifier cdc to identify the diffusion time t based on z’s dependence. This classifier recovers the copula density. 3 CLASSIFICATION-DIFFUSION COPULA In this Section, we develop classification-diffusion copulas whose strength lies in performing… view at source ↗
Figure 3
Figure 3. Figure 3: Reflection copula design. (Left panel) Copula data in black are given red velocities to move with time. (Middle panel) Trajectories are reflected from R d to [0, 1]d following mirrored blue outlines. (Third panel) Reflected trajectories diffuse the copula with time. The reflection copula then learns a velocity predictor v(u, t) := E[vt|u = u] for the average velocity, needed for sampling. As many copula ap… view at source ↗
Figure 4
Figure 4. Figure 4: Reflection copula sampling. Initialised from uniform samples at t = T, following −v ∗ (u, t) preserves marginals and generates u0 ∼ c(u) at t = 0. In other words, the expected velocity of this process is enough to generate samples. However, we first require a sample uT at time T from our process. As we show in Proposition 7, uT ∼ U[0, 1]d for T → ∞. Therefore, we initialise Eq. (7) with uni￾form samples fo… view at source ↗
Figure 5
Figure 5. Figure 5: Copula image samples: Data and copula scale MNIST samples (top rows), copula scale Cifar samples (bottom rows). Only our designs accurately represent the complex dependencies. To demonstrate our methods’ scalability, we require datasets with pre-estimated marginal distribu￾tions. As this scale of experiments has not been fully studied in previous works, we rely on image datasets where CDFs are trivial and … view at source ↗
Figure 6
Figure 6. Figure 6: Simulation study: Plots of u 1 , u 2 copula samples with colours representing LLs. 27 [PITH_FULL_IMAGE:figures/full_fig_p027_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Simulation study: We show the distribution of errors for our cdc and the ratio copula. Overall, our model achieves more accurate density estimates, with errors of smaller magnitudes. 28 [PITH_FULL_IMAGE:figures/full_fig_p028_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Convergence to uniformity of forward copula process. Initialised at 1000 copula obser￾vations, we run the processes forward in time and measure the Wasserstein-2 distance with respect to the Uniform distribution. We show the result with one standard deviation in blue across 10 mea￾surements, depicting the fast convergence to uniformity. C.4 UNIFORMITY OF GENERATED SAMPLES For a copula to correctly sample a… view at source ↗
Figure 9
Figure 9. Figure 9: Uniformity of samples for Magic [PITH_FULL_IMAGE:figures/full_fig_p030_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Uniformity of samples for Dry Bean: 30 [PITH_FULL_IMAGE:figures/full_fig_p030_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Uniformity of samples for Robocup [PITH_FULL_IMAGE:figures/full_fig_p031_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Uniformity of samples for digits [PITH_FULL_IMAGE:figures/full_fig_p031_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Uniformity of samples for MNIST: 31 [PITH_FULL_IMAGE:figures/full_fig_p031_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Uniformity of samples for Cifar: C.5 VISUALISING SAMPLES Visualising scientific datasets. We show pair plots for each dataset below, obtained from 1000 samples from our model, shown in the lower left, compared to 1000 randomly drawn observations from the test set (only one of our 10 runs is shown). We note that pair plots are unable to convey higher-order dependencies; they only provide an aggregate view … view at source ↗
Figure 15
Figure 15. Figure 15: Pair plots of copula samples vs observations. In the lower left of each plot, we show our copula samples u with the reflection copula in blue in the left column and our cdc copula in green in the right column. Observed data is shown in the upper right of each plot in grey. Note that instead of being mirrored, the dimension pairs are reversed for the bottom-left parts of the plots to match the orientation … view at source ↗
Figure 16
Figure 16. Figure 16: Samples for MNIST: Random samples from all methods, shown on the copula [0, 1]784 scale with lighter values being closer to 1. Notice how most of the variation takes place in the centre, where the shape of the number appears as high u values. However, image edges are not very dependent and take random u values. 34 [PITH_FULL_IMAGE:figures/full_fig_p034_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Samples for MNIST: Random samples from all methods, shown on the data R 784 scale with lighter values being higher. Our reflection and cdc copulas correctly produce samples resem￾bling digits, while competing copulas struggle to produce coherent samples. 35 [PITH_FULL_IMAGE:figures/full_fig_p035_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Samples for Cifar: Random samples from all methods, shown on the copula [0, 1]1024 scale with lighter values being closer to 1. Notice how here, all dimensions of the image are depen￾dent and matter in producing coherent samples. As a consequence, the images are already visible on the copula scale without needing to transform them to the data scale. 36 [PITH_FULL_IMAGE:figures/full_fig_p036_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: 10-nearest neighbours of generated samples: For random samples from our models, we show the 10 nearest neighbours with respect to the Euclidean distance.It is apparent that the generated samples differ from the observations, meaning the models did not simply remember the training dataset. Best viewed digitally. 37 [PITH_FULL_IMAGE:figures/full_fig_p037_19.png] view at source ↗
read the original abstract

Copulas are a fundamental tool for modelling multivariate dependencies in data, forming the method of choice in diverse fields and applications. However, the adoption of existing models for multimodal and high-dimensional dependencies is hindered by restrictive assumptions and poor scaling. In this work, we present methods for modelling copulas based on the principles of diffusions and flows. We design two processes that progressively forget inter-variable dependencies while leaving dimension-wise distributions unaffected, provably defining valid copulas at all times. We show how to obtain copula models by learning to remember the forgotten dependencies from each process, theoretically recovering the true copula at optimality. The first instantiation of our framework focuses on direct density estimation, while the second specialises in expedient sampling. Empirically, we demonstrate the superior performance of our proposed methods over state-of-the-art copula approaches in modelling complex and high-dimensional dependencies from scientific datasets and images. Our work enhances the representational power of copula models, empowering applications and paving the way for their adoption on larger scales and more challenging domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces diffusion- and flow-based methods for copula modeling. It designs two processes that progressively forget inter-variable dependencies while exactly preserving marginal distributions, thereby yielding valid copulas at every intermediate step. Copula models are then obtained by training a network to reverse the forgetting process and recover the original dependencies; the authors claim that this recovers the true copula at optimality. Two concrete instantiations are given—one for direct density estimation and one specialized for sampling—together with empirical comparisons showing improved performance over existing copula models on high-dimensional scientific data and image datasets.

Significance. If the theoretical guarantees hold, the framework would meaningfully extend the representational capacity of copulas to multimodal and high-dimensional settings by leveraging the flexibility of score/flow matching, while retaining the marginal-separation property that makes copulas attractive. The dual density-estimation and sampling pathways, together with the explicit construction of time-dependent valid copulas, constitute a concrete advance over prior restrictive parametric copulas.

major comments (2)
  1. [§3 and §4] §3 (forgetting processes) and §4 (recovery theorem): the claim that the processes 'provably define valid copulas at all times' and that learning recovers the true copula at optimality rests on the marginal distributions remaining exactly invariant and on the training objective being strictly proper for the conditional copula density. The manuscript does not supply an explicit derivation showing that the chosen denoising/score-matching loss is minimized uniquely by the true conditional copula (rather than by a mode-seeking or biased approximation) nor that the chosen parameterization class can represent it without systematic bias.
  2. [§4.2] §4.2 (optimality argument): the statement that the learned remembering process 'theoretically recover[s] the true copula at optimality' assumes global convergence to the unique minimizer of the loss. No analysis is provided of whether the objective is convex in the relevant function space or whether the high-dimensional optimization is free of the usual pathologies (local minima, mode collapse) that would prevent exact recovery in practice.
minor comments (2)
  1. [§2] Notation for the time-dependent copula density and the forgetting schedule should be introduced once and used consistently; several symbols are redefined without cross-reference.
  2. [Figure 2] Figure 2 (process trajectories) would benefit from an additional panel showing the marginal histograms at intermediate times to visually confirm invariance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below, clarifying the theoretical foundations while acknowledging where additional exposition strengthens the manuscript. Revisions will be incorporated in the next version.

read point-by-point responses
  1. Referee: [§3 and §4] §3 (forgetting processes) and §4 (recovery theorem): the claim that the processes 'provably define valid copulas at all times' and that learning recovers the true copula at optimality rests on the marginal distributions remaining exactly invariant and on the training objective being strictly proper for the conditional copula density. The manuscript does not supply an explicit derivation showing that the chosen denoising/score-matching loss is minimized uniquely by the true conditional copula (rather than by a mode-seeking or biased approximation) nor that the chosen parameterization class can represent it without systematic bias.

    Authors: We appreciate this observation. The validity of intermediate copulas follows from the explicit construction: the forgetting processes are defined to leave each marginal distribution invariant (Propositions 1 and 2) while progressively removing dependence, which by Sklar’s theorem yields a valid copula at every time. For recovery, the objective is the standard denoising score-matching (or flow-matching) loss applied to the conditional copula density; this loss is known to be strictly proper, with the unique minimizer being the true score when the model class is sufficiently rich. We acknowledge that an explicit, self-contained derivation tailored to the copula setting is missing. In revision we will add an appendix containing (i) the proof that the loss is uniquely minimized by the true conditional copula density and (ii) a discussion of the universal-approximation properties of the chosen network architectures together with practical safeguards against systematic bias. revision: yes

  2. Referee: [§4.2] §4.2 (optimality argument): the statement that the learned remembering process 'theoretically recover[s] the true copula at optimality' assumes global convergence to the unique minimizer of the loss. No analysis is provided of whether the objective is convex in the relevant function space or whether the high-dimensional optimization is free of the usual pathologies (local minima, mode collapse) that would prevent exact recovery in practice.

    Authors: We agree that the optimality claim is conditional on reaching the global minimizer. The manuscript states recovery “at optimality,” which we interpret as the population minimizer of a strictly proper loss; we do not claim that stochastic gradient descent on finite data necessarily attains this point. Full convexity analysis in the infinite-dimensional function space is intractable for neural-network parameterizations, a limitation shared by essentially all modern diffusion and flow models. In the revision we will expand §4.2 with a discussion of the optimization landscape, citing related results from the diffusion literature on local-minima behavior and mode-covering properties of score/flow matching, and we will report additional diagnostics (e.g., training-loss curves and multiple random seeds) to illustrate practical convergence on the datasets considered. revision: partial

Circularity Check

0 steps flagged

Derivation chain is self-contained; recovery at optimality follows from process invertibility and standard matching objectives.

full rationale

The paper first constructs explicit forgetting processes that leave marginals invariant and provably produce valid copulas at every timestep (via direct verification of the copula definition). It then defines a remembering process whose objective is standard conditional score or flow matching on the known conditional induced by the forgetting step. At optimality the learned model recovers the true conditional by the usual properties of strictly proper scoring rules for densities or flows; this does not reduce to a fitted parameter being renamed, nor does it rest on a self-citation chain or an ansatz smuggled from prior work. No equation equates the target copula to itself by construction, and the empirical results on external datasets provide an independent check. The derivation therefore remains non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; full text would be required to audit these.

pith-pipeline@v0.9.0 · 5713 in / 1083 out tokens · 44175 ms · 2026-05-21T21:47:30.248098+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages · 3 internal anchors

  1. [1]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

  2. [2]

    Pair-copula constructions of multiple dependence

    Kjersti Aas, Claudia Czado, Arnoldo Frigessi, and Henrik Bakken. Pair-copula constructions of multiple dependence. Insurance: Mathematics and economics, 44 0 (2): 0 182--198, 2009

  3. [3]

    Convergence of dif- fusion models under the manifold hypothesis in high-dimensions.arXiv preprint arXiv:2409.18804,

    Iskander Azangulov, George Deligiannidis, and Judith Rousseau. Convergence of diffusion models under the manifold hypothesis in high-dimensions. arXiv preprint arXiv:2409.18804, 2024

  4. [4]

    Characterizing neural dependencies with copula models

    Pietro Berkes, Frank Wood, and Jonathan Pillow. Characterizing neural dependencies with copula models. Advances in neural information processing systems, 21, 2008

  5. [5]

    A copula-based method to build diffusion models with prescribed marginal and serial dependence

    Enrico Bibbona, Laura Sacerdote, and Emiliano Torre. A copula-based method to build diffusion models with prescribed marginal and serial dependence. Methodology and computing in applied probability, 18: 0 765--783, 2016

  6. [6]

    Non-asymptotic bounds for forward processes in de- noising diffusions: Ornstein-Uhlenbeck is hard to beat.arXiv preprint arXiv:2408.13799,

    Miha Bre s ar and Aleksandar Mijatovi \'c . Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-uhlenbeck is hard to beat. arXiv preprint arXiv:2408.13799, 2024

  7. [7]

    Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

    Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, and Anru R Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. In International Conference on Learning Representations, 2023

  8. [8]

    Neural likelihoods via cumulative distribution functions

    Pawel Chilinski and Ricardo Silva. Neural likelihoods via cumulative distribution functions. In Conference on uncertainty in artificial intelligence, pp.\ 420--429. PMLR, 2020

  9. [9]

    Featurized density ratio estimation

    Kristy Choi, Madeline Liao, and Stefano Ermon. Featurized density ratio estimation. In Uncertainty in Artificial Intelligence, pp.\ 172--182. PMLR, 2021

  10. [10]

    Scaling hamiltonian monte carlo inference for bayesian neural networks with symmetric splitting

    Adam D Cobb and Brian Jalaian. Scaling hamiltonian monte carlo inference for bayesian neural networks with symmetric splitting. Uncertainty in Artificial Intelligence, 2021

  11. [11]

    The interdependence between rainfall and temperature: copula analyses

    Rong-Gang Cong and Mark Brady. The interdependence between rainfall and temperature: copula analyses. The Scientific World Journal, 2012 0 (1): 0 405675, 2012

  12. [12]

    Vine copula based dependence modeling in sustainable finance

    Claudia Czado, Karoline Bax, \"O zge Sahin, Thomas Nagler, Aleksey Min, and Sandra Paterlini. Vine copula based dependence modeling in sustainable finance. The Journal of Finance and Data Science, 8: 0 309--330, 2022

  13. [13]

    Numerically stable generation of correlation matrices and their factors

    Philip I Davies and Nicholas J Higham. Numerically stable generation of correlation matrices and their factors. BIT Numerical Mathematics, 40 0 (4): 0 640--651, 2000

  14. [14]

    Minimum scoring rule inference

    A Philip Dawid, Monica Musio, and Laura Ventura. Minimum scoring rule inference. Scandinavian Journal of Statistics, 43 0 (1): 0 123--138, 2016

  15. [15]

    Copula modelling to analyse financial data

    Paul R Dewick and Shuangzhe Liu. Copula modelling to analyse financial data. Journal of Risk and Financial Management, 15 0 (3): 0 104, 2022

  16. [16]

    Copula B ayesian networks

    Gal Elidan. Copula B ayesian networks. Advances in neural information processing systems, 23, 2010

  17. [17]

    Parameterizing and simulating from causal models

    Robin J Evans and Vanessa Didelez. Parameterizing and simulating from causal models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 86 0 (3): 0 535--568, 2024

  18. [18]

    Lebesguessche konstanten und divergente F ourierreihen

    Leopold Fej \'e r. Lebesguessche konstanten und divergente F ourierreihen. Journal für die reine und angewandte Mathematik, 1910

  19. [19]

    Copula modeling for discrete random vectors

    Gery Geenens. Copula modeling for discrete random vectors. Dependence Modeling, 8 0 (1): 0 417--440, 2020

  20. [20]

    Towards a universal representation of statistical dependence

    Gery Geenens. Towards a universal representation of statistical dependence. arXiv preprint arXiv:2302.08151, 2023

  21. [21]

    Tvinesynth: A truncated c-vine copula generator of synthetic tabular data to balance privacy and utility

    Elisabeth Griesbauer, Claudia Czado, Arnoldo Frigessi, and Ingrid Hob k Haff. Tvinesynth: A truncated c-vine copula generator of synthetic tabular data to balance privacy and utility. In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, volume 258 of Proceedings of Machine Learning Research, pp.\ 3511--3519. PMLR,...

  22. [22]

    Interpretation of rank histograms for verifying ensemble forecasts

    Thomas M Hamill. Interpretation of rank histograms for verifying ensemble forecasts. Monthly Weather Review, 129 0 (3): 0 550--560, 2001

  23. [23]

    Gans trained by a two time-scale update rule converge to a local N ash equilibrium

    Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local N ash equilibrium. Advances in neural information processing systems, 30, 2017

  24. [24]

    Denoising diffusion probabilistic models

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33: 0 6840--6851, 2020

  25. [25]

    Elements of copula modeling with R

    Marius Hofert, Ivan Kojadinovic, Martin M \"a chler, and Jun Yan. Elements of copula modeling with R. Springer, 2018

  26. [26]

    Quasi-random sampling for multivariate distributions via generative neural networks

    Marius Hofert, Avinash Prasad, and Mu Zhu. Quasi-random sampling for multivariate distributions via generative neural networks. Journal of Computational and Graphical Statistics, 30 0 (3): 0 647--670, 2021

  27. [27]

    Hamiltonian score matching and generative flows

    Peter Holderrieth, Yilun Xu, and Tommi Jaakkola. Hamiltonian score matching and generative flows. Advances in Neural Information Processing Systems, 37: 0 110464--110493, 2024

  28. [28]

    Probabilistic rainfall downscaling: Joint generalized neural models with censored spatial gaussian copula

    David Huk, Rilwan A Adewoyin, and Ritabrata Dutta. Probabilistic rainfall downscaling: Joint generalized neural models with censored spatial gaussian copula. arXiv preprint arXiv:2308.09827, 2023

  29. [29]

    Quasi- B ayes meets vines

    David Huk, Yuanhe Zhang, Ritabrata Dutta, and Mark Steel. Quasi- B ayes meets vines. Advances in Neural Information Processing Systems, 37: 0 40359--40392, 2024

  30. [30]

    Your copula is a classifier in disguise: classification-based copula density estimation

    David Huk, Mark Steel, and Ritabrata Dutta. Your copula is a classifier in disguise: classification-based copula density estimation. In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, volume 258 of Proceedings of Machine Learning Research, pp.\ 3790--3798. PMLR, 03--05 May 2025. URL https://proceedings.mlr.press...

  31. [31]

    Implicit generative copulas

    Tim Janke, Mohamed Ghanmi, and Florian Steinke. Implicit generative copulas. Advances in Neural Information Processing Systems, 34: 0 26028--26039, 2021

  32. [32]

    Variational diffusion models

    Diederik Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. Advances in neural information processing systems, 34: 0 21696--21707, 2021

  33. [33]

    Adam: A Method for Stochastic Optimization

    Diederik P Kingma. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

  34. [34]

    Selecting copulas for risk management

    Erik Kole, Kees Koedijk, and Marno Verbeek. Selecting copulas for risk management. Journal of Banking & Finance, 31 0 (8): 0 2405--2423, 2007

  35. [35]

    Deep archimedean copulas

    Chun Kai Ling, Fei Fang, and J Zico Kolter. Deep archimedean copulas. Advances in Neural Information Processing Systems, 33: 0 1535--1545, 2020

  36. [36]

    Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matthew Le. Flow matching for generative modeling. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=PqvMRDCJT9t

  37. [37]

    Discrete copula diffusion

    Anji Liu, Oliver Broadrick, Mathias Niepert, and Guy Van den Broeck. Discrete copula diffusion. arXiv preprint arXiv:2410.01949, 2024

  38. [38]

    Hacsurv: A hierarchical copula-based approach for survival analysis with dependent competing risks

    Xin Liu, Weijia Zhang, and Min-Ling Zhang. Hacsurv: A hierarchical copula-based approach for survival analysis with dependent competing risks. In International Conference on Artificial Intelligence and Statistics, pp.\ 3079--3087. PMLR, 2025

  39. [39]

    Robocupsimdata: A robocup soccer research dataset

    Olivia Michael, Oliver Obst, Falk Schmidsberger, and Frieder Stolzenburg. Robocupsimdata: A robocup soccer research dataset. arXiv preprint arXiv:1711.01703, 2017

  40. [40]

    An empirical B ayes estimator of the mean of a normal population

    Koichi Miyasawa et al. An empirical B ayes estimator of the mean of a normal population. Bull. Inst. Internat. Statist, 38 0 (181-188): 0 1--2, 1961

  41. [41]

    Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas

    Thomas Nagler and Claudia Czado. Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. Journal of Multivariate Analysis, 151: 0 69--89, 2016

  42. [42]

    Nonparametric estimation of simplified vine copula models: comparison of methods

    Thomas Nagler, Christian Schellhase, and Claudia Czado. Nonparametric estimation of simplified vine copula models: comparison of methods. Dependence Modeling, 5 0 (1): 0 99--120, 2017

  43. [43]

    Improving probabilistic diffusion models with optimal diagonal covariance matching

    Zijing Ou, Mingtian Zhang, Andi Zhang, Tim Z Xiao, Yingzhen Li, and David Barber. Improving probabilistic diffusion models with optimal diagonal covariance matching. In The Thirteenth International Conference on Learning Representations, 2025

  44. [44]

    Copula based trainable calibration error estimator of multi-label classification with label interdependencies

    Arkapal Panda and Utpal Garain. Copula based trainable calibration error estimator of multi-label classification with label interdependencies. In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, volume 258 of Proceedings of Machine Learning Research, pp.\ 3745--3753. PMLR, 03--05 May 2025. URL https://proceedings...

  45. [45]

    Semiparametric conformal prediction

    Ji Won Park and Kyunghyun Cho. Semiparametric conformal prediction. In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, volume 258 of Proceedings of Machine Learning Research, pp.\ 3880--3888. PMLR, 03--05 May 2025. URL https://proceedings.mlr.press/v258/park25c.html

  46. [46]

    Botied: Multi-objective B ayesian optimization with tied multivariate ranks

    Ji Won Park, Natasa Tagasovska, Michael Maser, Stephen Ra, and Kyunghyun Cho. Botied: Multi-objective B ayesian optimization with tied multivariate ranks. In International Conference on Machine Learning, pp.\ 39813--39833. PMLR, 2024

  47. [47]

    The synthetic data vault

    Neha Patki, Roy Wedge, and Kalyan Veeramachaneni. The synthetic data vault. In 2016 IEEE international conference on data science and advanced analytics (DSAA), pp.\ 399--410. IEEE, 2016

  48. [48]

    A mean field theory learning algorithm for neural network

    Carsten Peterson. A mean field theory learning algorithm for neural network. Complex systems, 1: 0 995--1019, 1987

  49. [49]

    Diffusion models for gaussian distributions: Exact solutions and wasserstein errors

    Emile Pierret and Bruno Galerne. Diffusion models for gaussian distributions: Exact solutions and wasserstein errors. In Forty-second International Conference on Machine Learning, 2025

  50. [50]

    Searching for Activation Functions

    Prajit Ramachandran, Barret Zoph, and Quoc V Le. Searching for activation functions. arXiv preprint arXiv:1710.05941, 2017

  51. [51]

    To smooth a cloud or to pin it down: Expressiveness guarantees and insights on score matching in denoising diffusion models

    Teodora Reu, Francisco Vargas, Anna Kerekes, and Michael M Bronstein. To smooth a cloud or to pin it down: Expressiveness guarantees and insights on score matching in denoising diffusion models. In The 40th Conference on Uncertainty in Artificial Intelligence, 2024

  52. [52]

    Telescoping density-ratio estimation

    Benjamin Rhodes, Kai Xu, and Michael U Gutmann. Telescoping density-ratio estimation. Advances in neural information processing systems, 33: 0 4905--4916, 2020

  53. [53]

    The diffusion duality

    Subham Sekhar Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin T Chiu, and Volodymyr Kuleshov. The diffusion duality. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=9P9Y8FOSOk

  54. [54]

    High-dimensional multivariate forecasting with low-rank gaussian copula processes

    David Salinas, Michael Bohlke-Schneider, Laurent Callot, Roberto Medico, and Jan Gasthaus. High-dimensional multivariate forecasting with low-rank gaussian copula processes. Advances in neural information processing systems, 32, 2019

  55. [55]

    On the use of copulas in hydrology: theory and practice

    Gianfausto Salvadori and Carlo De Michele. On the use of copulas in hydrology: theory and practice. Journal of Hydrologic Engineering, 12 0 (4): 0 369--380, 2007

  56. [56]

    Applied stochastic differential equations, volume 10

    Simo S \"a rkk \"a and Arno Solin. Applied stochastic differential equations, volume 10. Cambridge University Press, 2019

  57. [57]

    Fonctions de r \'e partition \`a n dimensions et leurs marges

    M Sklar. Fonctions de r \'e partition \`a n dimensions et leurs marges. In Annales de l'ISUP, volume 8, pp.\ 229--231, 1959

  58. [58]

    Denoising Diffusion Implicit Models

    Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020

  59. [59]

    Generative modeling by estimating gradients of the data distribution

    Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019

  60. [60]

    Score-based generative modeling through stochastic differential equations

    Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=PxTIG12RRHS

  61. [61]

    Estimating the density ratio between distributions with high discrepancy using multinomial logistic regression

    Akash Srivastava, Seungwook Han, Kai Xu, Benjamin Rhodes, and Michael U Gutmann. Estimating the density ratio between distributions with high discrepancy using multinomial logistic regression. Transactions on Machine Learning Research, 2023

  62. [62]

    Copula conformal prediction for multi-step time series prediction

    Sophia Huiwen Sun and Rose Yu. Copula conformal prediction for multi-step time series prediction. In The Twelfth International Conference on Learning Representations, 2023

  63. [63]

    Learning vine copula models for synthetic data generation

    Yi Sun, Alfredo Cuesta-Infante, and Kalyan Veeramachaneni. Learning vine copula models for synthetic data generation. In Proceedings of the aaai conference on artificial intelligence, volume 33, pp.\ 5049--5057, 2019

  64. [64]

    Copulas as high-dimensional generative models: Vine copula autoencoders

    Natasa Tagasovska, Damien Ackerer, and Thibault Vatter. Copulas as high-dimensional generative models: Vine copula autoencoders. Advances in neural information processing systems, 32, 2019

  65. [65]

    Retrospective uncertainties for deep models using vine copulas

    Natasa Tagasovska, Firat Ozdemir, and Axel Brando. Retrospective uncertainties for deep models using vine copulas. In International Conference on Artificial Intelligence and Statistics, pp.\ 7528--7539. PMLR, 2023

  66. [66]

    Future multivariate weather generation by combining bartlett-lewis and vine copula models

    Jorn Van de Velde, Matthias Demuzere, Bernard De Baets, and Niko Verhoest. Future multivariate weather generation by combining bartlett-lewis and vine copula models. Hydrological Sciences Journal, 68 0 (1): 0 1--15, 2023

  67. [67]

    T Vatter and T Pyvinecopulib Nagler. 0.6. 1, 2022

  68. [68]

    A study of dependency features of spike trains through copulas

    Pietro Verzelli and Laura Sacerdote. A study of dependency features of spike trains through copulas. Biosystems, 184: 0 104014, 2019

  69. [69]

    Multi-agent imitation learning with copulas

    Hongwei Wang, Lantao Yu, Zhangjie Cao, and Stefano Ermon. Multi-agent imitation learning with copulas. In Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13--17, 2021, Proceedings, Part I 21, pp.\ 139--156. Springer, 2021

  70. [70]

    Your diffusion model is secretly a noise classifier and benefits from contrastive training

    Yunshu Wu, Yingtao Luo, Xianghao Kong, Vagelis Papalexakis, and Greg Ver Steeg. Your diffusion model is secretly a noise classifier and benefits from contrastive training. Advances in Neural Information Processing Systems, 37: 0 32370--32399, 2024

  71. [71]

    Classification diffusion models: Revitalizing density ratio estimation

    Shahar Yadin, Noam Elata, and Tomer Michaeli. Classification diffusion models: Revitalizing density ratio estimation. In Advances in Neural Information Processing Systems, volume 37, pp.\ 9837--9863. Curran Associates, Inc., 2024. URL https://proceedings.neurips.cc/paper_files/paper/2024/file/13183a224208671a6fc33ba1aa661ec4-Paper-Conference.pdf

  72. [72]

    @esa (Ref

    \@ifxundefined[1] #1\@undefined \@firstoftwo \@secondoftwo \@ifnum[1] #1 \@firstoftwo \@secondoftwo \@ifx[1] #1 \@firstoftwo \@secondoftwo [2] @ #1 \@temptokena #2 #1 @ \@temptokena \@ifclassloaded agu2001 natbib The agu2001 class already includes natbib coding, so you should not add it explicitly Type <Return> for now, but then later remove the command n...

  73. [73]

    \@lbibitem[] @bibitem@first@sw\@secondoftwo \@lbibitem[#1]#2 \@extra@b@citeb \@ifundefined br@#2\@extra@b@citeb \@namedef br@#2 \@nameuse br@#2\@extra@b@citeb \@ifundefined b@#2\@extra@b@citeb @num @parse #2 @tmp #1 NAT@b@open@#2 NAT@b@shut@#2 \@ifnum @merge>\@ne @bibitem@first@sw \@firstoftwo \@ifundefined NAT@b*@#2 \@firstoftwo @num @NAT@ctr \@secondoft...

  74. [74]

    @open @close @open @close and [1] URL: #1 \@ifundefined chapter * \@mkboth \@ifxundefined @sectionbib * \@mkboth * \@mkboth\@gobbletwo \@ifclassloaded amsart * \@ifclassloaded amsbook * \@ifxundefined @heading @heading NAT@ctr thebibliography [1] @ \@biblabel @NAT@ctr \@bibsetup #1 @NAT@ctr @ @openbib .11em \@plus.33em \@minus.07em 4000 4000 `\.\@m @bibit...