
arxiv: 1907.03361 · v1 · pith:BWCKIVZ5new · submitted 2019-07-07 · 💻 cs.LG · stat.ML

Copula & Marginal Flows: Disentangling the Marginal from its Joint

Pith reviewed 2026-05-25 01:07 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords copula flows · marginal flows · generative models · normalizing flows · tail modeling · distributional properties · deep generative networks

The pith

Copula and marginal flows separate dependence structure from marginal distributions to enable exact tail modeling in generative networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard generative networks face limits on the tails they can express, with derived upper bounds showing that optimal networks often do not exist in certain Lp spaces. The paper proposes copula and marginal generative flows that model the joint dependence via a copula independent of the marginals. Once the copula approximates the uniform distribution, any desired marginal CDF can be imposed exactly. This addresses the lack of exact tail asymptotics and extrapolation in existing deep generative models such as GANs and normalizing flows. Numerical results are presented in support of the new flows.
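The tail limitation can be illustrated with a minimal sketch. It assumes a Lipschitz generator (ReLU networks are Lipschitz), for which |g(z)| ≤ |g(0)| + L‖z‖ holds pointwise, so the output tail cannot be heavier than that of the rescaled latent norm. This is a generic Lipschitz argument and not necessarily the bound derived in the paper; the network sizes, the random weights, and the name `generator` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer ReLU generator g: R^d0 -> R with random weights.
d0, hidden = 4, 32
W1 = rng.normal(size=(hidden, d0))
b1 = rng.normal(size=hidden)
w2 = rng.normal(size=hidden)
b2 = 0.5

def generator(z):
    """ReLU network applied row-wise to latent samples z of shape (n, d0)."""
    h = np.maximum(z @ W1.T + b1, 0.0)
    return h @ w2 + b2

# Upper bound on the Lipschitz constant: product of the layer norms
# (the ReLU activation itself is 1-Lipschitz).
lipschitz = np.linalg.norm(W1, 2) * np.linalg.norm(w2)

z = rng.normal(size=(100_000, d0))
out = generator(z)
g0 = generator(np.zeros((1, d0)))[0]

# Pointwise envelope |g(z)| <= |g(0)| + L * ||z||_2, which implies
# P(|g(Z)| > x) <= P(|g(0)| + L * ||Z||_2 > x) for every x.
envelope = np.abs(g0) + lipschitz * np.linalg.norm(z, axis=1)
violations = int(np.sum(np.abs(out) > envelope + 1e-9))
print("envelope violations:", violations)
```

With Gaussian latent noise the envelope has a Gaussian-type tail, so under these assumptions such a generator cannot reproduce a polynomially decaying (Pareto-type) tail.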

Core claim

The central claim is that copula and marginal generative flows (CM flows) allow for an exact modeling of the tail and any prior assumption on the CDF up to an approximation of the uniform distribution, in contrast to standard generative networks whose expressible tails are bounded above.

What carries the argument

Copula and marginal generative flows (CM flows), which disentangle the dependence structure captured by the copula from the separate marginal distributions.

If this is right

  • Generative networks have upper bounds on the tails they can express in various situations.
  • In some cases no optimal generative network exists for given tail properties.
  • CM flows permit imposing any desired marginal CDF exactly after sufficient uniform approximation.
  • The approach supports extrapolation of distributional properties like tail asymptotics.
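The copula-then-marginal composition can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: a closed-form Clayton sampler (Marshall-Olkin construction) stands in for the learned copula flow, and an analytic Pareto inverse CDF stands in for the marginal flow. The function names `sample_clayton` and `pareto_inverse_cdf` and all parameter values are hypothetical.

```python
import numpy as np

def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton copula (stand-in for a copula flow)."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n, 1))
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v) ** (-1.0 / theta)

def pareto_inverse_cdf(u, alpha, x_min=1.0):
    """Inverse CDF of a Pareto distribution with tail index alpha."""
    return x_min * (1.0 - u) ** (-1.0 / alpha)

rng = np.random.default_rng(0)

# Step 1: dependence structure on the unit square (uniform marginals).
u = sample_clayton(200_000, theta=2.0, rng=rng)

# Step 2: impose the desired marginals through their inverse CDFs.
alphas = np.array([2.0, 3.0])
x = pareto_inverse_cdf(u, alphas)

# Because the marginals of u are uniform, the tail P(X_i > t) should
# match the exact Pareto tail t**(-alpha_i) up to Monte Carlo error.
threshold = 10.0
empirical_tail = (x > threshold).mean(axis=0)
exact_tail = threshold ** (-alphas)
print("empirical:", empirical_tail, "exact:", exact_tail)
```

The design point is that the tail index enters only through the inverse CDF in step 2, so it can be set from prior knowledge rather than learned from sparse extreme observations.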

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The separation could improve accuracy in applications such as financial risk modeling that depend on precise tail probabilities.
  • The technique might be tested on synthetic data with known heavy-tailed marginals to measure extrapolation error.
  • It raises the question of whether similar disentangling can be applied to other generative architectures beyond flows.

Load-bearing premise

The dependence structure between variables can be fully captured by a copula that is independent of the marginal distributions.
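
For reference, this premise is the content of Sklar's theorem, stated here in standard textbook notation (the symbols F, F_i and C are not taken from this page). For a joint CDF F with marginal CDFs F_1, ..., F_d there exists a copula C such that the first identity holds; if the marginals are continuous, C is unique and is given by the second identity.

```latex
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr),
\qquad
C(u_1, \dots, u_d) = F\bigl(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)\bigr).
```

Uniqueness can fail when a marginal has atoms, so the premise is safest for continuous marginals.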

What would settle it

Showing that a CM flow with a copula component that closely approximates the uniform distribution still cannot impose a target marginal CDF exactly would falsify the exact modeling claim.

Figures

Figures reproduced from arXiv: 1907.03361 by Magnus Wiese, Ralf Korn, Robert Knobloch.

Figure 1: (A) and (B) depict the densities f(x) of a Gaussian, exponential, t-, and Pareto distribution on the linear and log scale, respectively.
… it is essential to find a generative network that satisfies $F_{g_{\theta,i}(Z)}(x) = F_{X_i}(x)$ (Eq. 1) for all $i = 1, \dots, d_1$ and $|x| \gg 0$, since the asymptotic behavior determines the propensity to generate extremal values (see …
Figure 2: (a) illustrates the theoretical density of the Gumbel copula, (b) the density obtained by …
Figure 3: A commutative diagram of the proposed generative flow.
… Following this decomposition, CM flows are explicitly constructed by composing a copula flow $h_\eta : [0, 1]^2 \to [0, 1]^2$ with a marginal flow $m_\theta : [0, 1]^2 \to \mathbb{R}^2$. The marginal flow approximates the inverse CDFs $F_{X_1}^{-1}, F_{X_2}^{-1}$, whereas the copula flow approximates the generating function of $C := (F_{X_1}(X_1), F_{X_2}(X_2))$. Thus, a CM flow is given by $g_{\theta,\eta}(u)$…
Figure 5: Frank copula.
… 6.1 Metrics and Divergences. The first performance measure we track is the Jensen-Shannon divergence (JSD) of the targeted copula $C$ and the approximation $\tilde{C}$, which we denote by $\mathrm{JSD}(C \,\|\, \tilde{C})$. Furthermore, to assess whether the marginal distributions generated by the copula flow are uniform, we approximate via Monte Carlo, for $i = 1, 2$ and $A_k = [(k-1)/n, k/n)$, $k = 1, \dots, n$, the metric $T(i, n)$ :…
Figure 6: Pointwise evaluation of the JSD of the Clayton, Frank and Gumbel theoretical and copula …
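
The Jensen-Shannon divergence mentioned in the Figure 5 excerpt can be estimated from samples in several ways. The sketch below uses a simple two-dimensional histogram estimator on the unit square; this estimator, the bin count, and the function name `jsd_from_samples` are assumptions for illustration, and the paper's own estimator and its uniformity metric T(i, n) are not specified in the text available here.

```python
import numpy as np

def jsd_from_samples(u, v, bins=20):
    """Histogram estimate of the Jensen-Shannon divergence (in nats)
    between two sample sets on the unit square [0, 1]^2."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    p, _, _ = np.histogram2d(u[:, 0], u[:, 1], bins=[edges, edges])
    q, _, _ = np.histogram2d(v[:, 0], v[:, 1], bins=[edges, edges])
    p = p.ravel() / p.sum()
    q = q.ravel() / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # cells with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = np.random.default_rng(0)
independent = rng.uniform(size=(50_000, 2))  # independence copula
w = rng.uniform(size=(50_000, 1))
comonotone = np.hstack([w, w])               # perfectly dependent pair

jsd_same = jsd_from_samples(independent, independent)
jsd_diff = jsd_from_samples(independent, comonotone)
print("same:", jsd_same, "different:", jsd_diff)
```

The divergence is bounded above by log 2 in nats, which gives a fixed scale for comparing a target copula with its approximation.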
Original abstract

Deep generative networks such as GANs and normalizing flows flourish in the context of high-dimensional tasks such as image generation. However, so far exact modeling or extrapolation of distributional properties such as the tail asymptotics generated by a generative network is not available. In this paper, we address this issue for the first time in the deep learning literature by making two novel contributions. First, we derive upper bounds for the tails that can be expressed by a generative network and demonstrate Lp-space related properties. There we show specifically that in various situations an optimal generative network does not exist. Second, we introduce and propose copula and marginal generative flows (CM flows) which allow for an exact modeling of the tail and any prior assumption on the CDF up to an approximation of the uniform distribution. Our numerical results support the use of CM flows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims two contributions: (1) derivation of upper bounds on tails expressible by generative networks together with Lp-space properties, including non-existence of optimal networks in various situations; (2) introduction of copula and marginal generative flows (CM flows) that model dependence on the unit cube via a copula while imposing arbitrary marginal CDFs exactly via inverse transforms, thereby achieving exact tail modeling up to the error incurred in approximating the uniform distribution. Numerical experiments are said to support the CM-flow construction.

Significance. If the tail-bound derivations and the exactness claim for CM flows hold, the work would supply a principled mechanism for controlling marginal distributions and tail asymptotics independently in deep generative models, which is relevant for risk-sensitive applications. The construction rests on the standard Sklar decomposition but applies it to flows in a way that evades the stated limitations of non-decomposed networks.

major comments (2)
  1. [Abstract] Abstract: the derivation of upper bounds on tails and the non-existence results for optimal generative networks are asserted, yet no equations, proof sketches, or quantification of the Lp-space properties appear; without these the central claim that standard networks are provably limited cannot be verified.
  2. [CM flows] CM-flow construction (second contribution): the assertion of 'exact modeling of the tail' up to uniform approximation requires an explicit bound showing how the copula-flow approximation error propagates into the marginal tails; absent this control the exactness claim is not established.
minor comments (1)
  1. The abstract would be clearer if it briefly indicated the architecture or training procedure used for the numerical support of CM flows.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address the two major comments point by point below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the derivation of upper bounds on tails and the non-existence results for optimal generative networks are asserted, yet no equations, proof sketches, or quantification of the Lp-space properties appear; without these the central claim that standard networks are provably limited cannot be verified.

    Authors: The abstract is a concise summary of the two contributions. The derivations of the upper bounds on tails expressible by generative networks, the Lp-space properties, and the non-existence of optimal networks in various situations are provided with equations and proof sketches in Section 3 of the manuscript, with complete proofs in the appendix. The central claims are therefore verifiable from the body of the paper rather than the abstract.
    Revision: no

  2. Referee: [CM flows] CM-flow construction (second contribution): the assertion of 'exact modeling of the tail' up to uniform approximation requires an explicit bound showing how the copula-flow approximation error propagates into the marginal tails; absent this control the exactness claim is not established.

    Authors: In the CM-flow construction the marginal CDFs (and therefore their tails) are imposed exactly by the inverse transform applied after the copula flow; the copula approximation error affects only the dependence structure on the unit cube. The joint-tail error is consequently controlled solely by the uniform approximation error. To make this propagation explicit we will add a short lemma with the corresponding bound in the revised version.
    Revision: yes
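
The propagation at issue in the second exchange can be made concrete with a short worked identity. This is a sketch under stated assumptions and not the lemma the simulated rebuttal promises: $\tilde{U}$ denotes one output coordinate of the copula flow with CDF $G$ on $[0, 1]$, and $F$ is a continuous, strictly increasing target marginal CDF. The symbols $\tilde{U}$, $G$ and $\tilde{X}$ are introduced here for illustration only.

```latex
\tilde{X} := F^{-1}(\tilde{U})
\;\Longrightarrow\;
P(\tilde{X} \le x) = P\bigl(\tilde{U} \le F(x)\bigr) = G\bigl(F(x)\bigr),
\qquad
\frac{P(\tilde{X} > x)}{1 - F(x)} = \frac{1 - G(u)}{1 - u}\bigg|_{u = F(x)} .
```

The marginal is exact when $G(u) = u$, that is, when the copula-flow coordinate is exactly uniform. Otherwise the relative tail error is governed by how $1 - G(u)$ compares with $1 - u$ as $u \to 1$, which is the quantity an explicit bound would need to control.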

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The paper's derivation chain consists of two independent contributions: (1) explicit upper bounds on tail behavior for standard generative networks (with proofs that optimal networks may not exist in certain Lp settings), and (2) a new CM-flow construction that applies Sklar's theorem to factor the joint into a copula component modeled by a flow on the unit cube plus exact marginal CDFs imposed via inverse transforms. Neither step reduces its claimed result to a fitted parameter, self-citation, or definitional renaming; the 'exact marginal' property holds by the explicit architectural separation rather than by construction from the target quantity itself. No load-bearing self-citations or ansatz smuggling appear in the stated claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on standard copula theory (domain assumption) and the modeling choice that marginals and dependence can be separated without loss of expressivity for tail purposes (ad_hoc_to_paper). No free parameters or invented entities with independent evidence are described in the abstract.

axioms (1)
  • Domain assumption: marginal distributions and dependence structure can be modeled independently via copulas without affecting tail asymptotics.
    Invoked to justify the CM-flow split in the second contribution.
invented entities (1)
  • Copula and marginal generative flows (CM flows) · no independent evidence
    purpose: Separate modeling of marginal CDFs (including tails) from joint dependence.
    Newly proposed architecture; no external falsifiable evidence supplied in abstract.

pith-pipeline@v0.9.0 · 5668 in / 1322 out tokens · 20866 ms · 2026-05-25T01:07:10.382868+00:00 · methodology


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Valid and Expressive Copulas for Irregular Multivariate Time Series

    cs.LG 2026-05 unverdicted novelty 7.0

    CopFITi is the first marginalization-consistent copula for irregular multivariate time series, using normalizing flows for marginals and a Gaussian mixture copula for dependencies to reach new state-of-the-art joint d...

  2. Extrapolation in Statistical Learning with Extreme Value Theory

    stat.ML 2026-05 unverdicted novelty 2.0

    A survey of recent methods that apply extreme value theory to enable extrapolation in statistical learning and machine learning.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · cited by 2 Pith papers · 5 internal anchors

  1. [1]

    Pair-Copula Constructions of Multiple Dependence

    Kjersti Aas et al. “Pair-Copula Constructions of Multiple Dependence”. In: Insurance: Mathematics and Economics 44 (Apr. 2009), pp. 182–198

  2. [2]

    Towards Principled Methods for Training Generative Adversarial Networks

    Martín Arjovsky and Léon Bottou. “Towards Principled Methods for Training Generative Adversarial Networks”. In: CoRR abs/1701.04862 (2017)

  3. [3]

    Size-Noise Tradeoffs in Generative Networks

    Bolton Bailey and Matus J Telgarsky. “Size-Noise Tradeoffs in Generative Networks”. In: Advances in Neural Information Processing Systems 31. Ed. by S. Bengio et al. Curran Associates, Inc., 2018, pp. 6490–6500

  4. [4]

    H. Bauer. Wahrscheinlichkeitstheorie. De-Gruyter-Lehrbuch. de Gruyter, 2002. ISBN: 9783110172362

  5. [5]

    Vines - A new graphical model for dependent random variables

    Tim Bedford and Roger Cooke. “Vines - A new graphical model for dependent random variables”. In: Annals of Statistics 30 (Sept. 1999)

  6. [6]

    Large Scale GAN Training for High Fidelity Natural Image Synthesis

    Andrew Brock, Jeff Donahue, and Karen Simonyan. “Large Scale GAN Training for High Fidelity Natural Image Synthesis”. In: CoRR abs/1809.11096 (2018)

  7. [7]

    Extreme value theory: an introduction

    Laurens De Haan and Ana Ferreira. Extreme value theory: an introduction. Springer Science & Business Media, 2007

  8. [8]

    NICE: Non-linear Independent Components Estimation

    Laurent Dinh, David Krueger, and Yoshua Bengio. “NICE: Non-linear Independent Components Estimation”. In: 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Workshop Track Proceedings. 2015

  9. [9]

    Density estimation using Real NVP

    Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. “Density estimation using Real NVP”. In: CoRR abs/1605.08803 (2016)

  10. [10]

    Copula Bayesian Networks

    Gal Elidan. “Copula Bayesian Networks”. In: Advances in Neural Information Processing Systems 23. Ed. by J. D. Lafferty et al. Curran Associates, Inc., 2010, pp. 559–567

  11. [11]

    Practical Extreme Value Modelling of Hydrological Floods and Droughts: A Case Study

    Kolbjørn Engeland, Hege Hisdal, and Arnoldo Frigessi. “Practical Extreme Value Modelling of Hydrological Floods and Droughts: A Case Study”. In: Extremes 7 (Mar. 2004), pp. 5–30

  12. [12]

    Maxout Networks

    Ian J. Goodfellow et al. “Maxout Networks”. In: Proceedings of the 30th International Conference on International Conference on Machine Learning - Volume 28. ICML’13. Atlanta, GA, USA: JMLR.org, 2013, pp. III-1319–III-1327

  13. [13]

    Generative Adversarial Nets

    Ian Goodfellow et al. “Generative Adversarial Nets”. In: Advances in Neural Information Processing Systems 27. Ed. by Z. Ghahramani et al. Curran Associates, Inc., 2014, pp. 2672–2680

  14. [14]

    Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

    Kaiming He et al. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV). ICCV ’15. Washington, DC, USA: IEEE Computer Society, 2015, pp. 1026–1034. ISBN: 978-1-4673-8391-2

  15. [15]

    Approximation capabilities of multilayer feedforward networks

    Kurt Hornik. “Approximation capabilities of multilayer feedforward networks”. In: Neural Networks 4.2 (1991), pp. 251–257. ISSN: 0893-6080

  16. [16]

    Neural Autoregressive Flows

    Chin-Wei Huang et al. “Neural Autoregressive Flows”. In: CoRR abs/1804.00779 (2018)

  17. [17]

    Monte Carlo Methods and Models in Finance and Insurance

    Ralf Korn, Elke Korn, and Gerald Kroisandt. “Monte Carlo Methods and Models in Finance and Insurance”. In: (Jan. 2010)

  18. [18]

    Introduction to Vine Copulas

    Nicole Krämer and Ulf Schepsmeier. Introduction to Vine Copulas. 2011

  19. [19]

    Efficient BackProp

    Yann LeCun et al. “Efficient BackProp”. In: Neural Networks: Tricks of the Trade, This Book is an Outgrowth of a 1996 NIPS Workshop. London, UK: Springer-Verlag, 1998, pp. 9–50. ISBN: 3-540-65311-2

  20. [20]

    Which Training Methods for GANs do actually Converge?

    Lars M. Mescheder. “On the convergence properties of GAN training”. In: CoRR abs/1801.04406 (2018)

  21. [21]

    Rectified Linear Units Improve Restricted Boltzmann Machines

    Vinod Nair and Geoffrey E. Hinton. “Rectified Linear Units Improve Restricted Boltzmann Machines”. In: Proceedings of the 27th International Conference on International Conference on Machine Learning. ICML’10. Haifa, Israel: Omnipress, 2010, pp. 807–814. ISBN: 978-1-60558-907-7

  22. [22]

    The Double Pareto-Lognormal Distribution—A New Parametric Model for Size Distributions

    William Reed and Murray Jorgensen. “The Double Pareto-Lognormal Distribution—A New Parametric Model for Size Distributions”. In: Communications in Statistics. Theory and Methods 8 (May 2004)

  23. [23]

    Variational Inference with Normalizing Flows

    Danilo Jimenez Rezende and Shakir Mohamed. “Variational Inference with Normalizing Flows”. In: Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37. ICML’15. Lille, France: JMLR.org, 2015, pp. 1530–1538

  24. [24]

    Survival Probabilities Based on Pareto Claim Distributions

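    Hilary L. Seal. “Survival Probabilities Based on Pareto Claim Distributions”. In: ASTIN Bulletin 11.1 (1980), pp. 61–71.

    A Proofs

    Proof of Lemma 6.
    $$
    \begin{aligned}
    P\Bigl(a \sum_{j=1}^{d_0} Z_j + b > x\Bigr)
    &= 1 - P\Bigl(a \sum_{j=1}^{d_0} Z_j \le x - b\Bigr) \\
    &\le 1 - P\Bigl(\bigcap_{j=1}^{d_0} \Bigl\{ a Z_j \le \frac{x - b}{d_0} \Bigr\}\Bigr) \\
    &= P\Bigl(\bigcup_{j=1}^{d_0} \Bigl\{ a Z_j > \frac{x - b}{d_0} \Bigr\}\Bigr) \\
    &\le d_0 \, P\Bigl(a Z_1 > \frac{x - b}{d_0}\Bigr)
    = d_0 \, P\bigl(a d_0 Z_1 + b > x\bigr).
    \end{aligned}
    $$

    Proof of Corollary 7. Since X...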