Heavy Tailed Homogeneous Structural Causal Models

Shuyang Bai; Vishal Routh

arxiv: 2604.04118 · v1 · submitted 2026-04-05 · 🧮 math.ST · stat.TH

Heavy Tailed Homogeneous Structural Causal Models

Vishal Routh , Shuyang Bai This is my paper

Pith reviewed 2026-05-13 17:09 UTC · model grok-4.3

classification 🧮 math.ST stat.TH

keywords causal discoveryheavy-tailed modelsstructural causal modelsancestral ordertail coefficientsdirected acyclic graphsimpulse responses

0 comments

The pith

Causal tail coefficients identify the complete ancestral partial order of the directed acyclic graph in heavy-tailed homogeneous structural causal models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces the heavy-tailed homogeneous structural causal model as a framework that unifies linear and max-linear models driven by heavy-tailed noise. It establishes that coefficients extracted from the upper tails of the observed distributions suffice to recover every ancestral relationship encoded in the underlying directed acyclic graph. A recursive procedure is then given that converts those same coefficients into ancestral impulse-response quantities associated with the model. The results supply a theoretically justified route to causal discovery when extreme events dominate the data rather than typical observations.

Core claim

In the heavy-tailed homogeneous structural causal model, the causal tail coefficients determine the complete ancestral partial order on the variables of the underlying directed acyclic graph, and these coefficients can be used in a recursive procedure to recover the ancestral impulse-responses associated with the model.

What carries the argument

The causal tail coefficient, a quantity computed from the joint tail behavior of pairs of variables that encodes whether one is an ancestor of the other in the causal graph.

If this is right

Causal discovery can proceed using only tail dependence information rather than the entire joint distribution.
The ancestral partial order is recoverable without restricting to light-tailed noise or to specific parametric families beyond the heavy-tail condition.
Ancestral impulse-response quantities can be obtained recursively once the tail coefficients are known.
Both linear and max-linear heavy-tailed models arise as special cases covered by the same identification result.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Datasets dominated by rare but extreme events, such as financial crashes or geophysical extremes, become candidates for this style of causal analysis.
If the tail coefficients succeed on correctly specified models, they may still provide useful partial information when the data only approximately follow the heavy-tailed homogeneous structural causal model.
The recursive recovery step could be adapted to estimate the strength of ancestral influences once the order is known.

Load-bearing premise

The observed data must be generated exactly from a heavy-tailed homogeneous structural causal model whose noise variables meet the required heavy-tail regularity conditions and whose graph is a directed acyclic graph.

What would settle it

Simulate data from a known directed acyclic graph with heavy-tailed noises satisfying the regularity conditions, compute the causal tail coefficients, and check whether they exactly reproduce the true ancestral partial order; any pair that is not an ancestor yet yields a positive coefficient, or any true ancestor that yields a zero coefficient, would falsify the identification claim.

read the original abstract

We consider causal discovery in structural causal models driven by heavy-tailed noise, where extremes carry important information about causal direction. We introduce the Heavy-Tailed Homogeneous Structural Causal Model (HT-HSCM), a unified framework that generalizes heavy-tailed linear and max-linear models. We demonstrate that causal tail coefficients identify the complete ancestral partial order of the underlying directed acyclic graph. We also formulate a recursive algorithm for recovering quantities associated with the model called ancestral impulse-responses from the causal tail coefficients. Our results provide a general and theoretically justified framework for causal discovery in heavy-tailed systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper introduces the Heavy-Tailed Homogeneous Structural Causal Model (HT-HSCM) as a unifying framework for structural causal models with heavy-tailed noise, generalizing linear and max-linear cases. It claims that causal tail coefficients identify the complete ancestral partial order of the underlying DAG and presents a recursive algorithm to recover ancestral impulse-responses from these coefficients.

Significance. If the identification result holds under the model's homogeneity and heavy-tail regularity conditions (regular variation with matching tail indices), the work supplies a theoretically justified route to causal discovery that exploits tail behavior rather than relying on Gaussian or light-tailed assumptions. This could strengthen causal inference in domains with extreme-value data such as finance or environmental science, provided the mapping from DAG to tail-coefficient matrix is shown to be injective on the ancestral relation.

major comments (1)

The central identification claim (that causal tail coefficients recover the full ancestral partial order) is load-bearing for the paper's contribution; the manuscript must explicitly derive the injectivity of the map from DAG to the matrix of tail coefficients under the stated regularity conditions on the noise, including any required matching of tail indices across variables.

minor comments (3)

Provide a concise definition of the HT-HSCM, homogeneity property, and causal tail coefficient at the start of the technical development, with explicit notation for the tail-index vector and the recursive recovery procedure.
Include a short pseudocode or numbered steps for the recursive algorithm recovering ancestral impulse-responses, together with a statement of its computational complexity.
Add a brief discussion or small simulation check confirming that the identification fails gracefully when the heavy-tail regularity conditions are mildly violated.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and the recommendation of minor revision. We address the single major comment below.

read point-by-point responses

Referee: The central identification claim (that causal tail coefficients recover the full ancestral partial order) is load-bearing for the paper's contribution; the manuscript must explicitly derive the injectivity of the map from DAG to the matrix of tail coefficients under the stated regularity conditions on the noise, including any required matching of tail indices across variables.

Authors: We agree that an explicit derivation of injectivity strengthens the presentation. While Theorems 3.1--3.3 and the recursive recovery algorithm in Section 4 already establish that the causal tail coefficients uniquely determine the ancestral partial order under the HT-HSCM homogeneity and regular-variation assumptions, the injectivity direction is not isolated as a standalone statement. In the revised manuscript we will add a new corollary immediately after Theorem 3.3 that proves: if two DAGs G and G' produce identical matrices of causal tail coefficients, then they induce the same ancestral relation. The proof proceeds by contraposition on the recursive definition of the tail coefficients, using the fact that a missing ancestral edge forces a zero entry that cannot be reproduced by any other configuration when all noise variables share the same tail index (as required by Assumption 2.3). The necessity of matching tail indices will be stated explicitly in the corollary and cross-referenced to the regularity conditions. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivation establishes that causal tail coefficients identify the ancestral partial order in the HT-HSCM by leveraging the model's homogeneity, the DAG structure, and external heavy-tail regularity conditions (regular variation with matching tail indices) on the noise terms. These assumptions are stated as model primitives independent of the target identification result, and the recursive recovery of ancestral impulse-responses is presented as a direct consequence of the coefficient matrix without redefinition or fitting that collapses back to the inputs. No self-citation chains, ansatz smuggling, or renaming of known results appear load-bearing; the mapping is injective under the stated conditions rather than by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Only the abstract is available, so the ledger is necessarily incomplete; the central claim rests on domain assumptions about heavy-tailed homogeneous SCMs that are not enumerated here.

axioms (1)

domain assumption Noise variables are heavy-tailed and the structural equations are homogeneous.
Stated as the setting in which causal tail coefficients carry directional information.

invented entities (1)

HT-HSCM no independent evidence
purpose: Unified model class generalizing heavy-tailed linear and max-linear SCMs
Newly introduced framework whose properties enable the tail-coefficient results.

pith-pipeline@v0.9.0 · 5377 in / 1185 out tokens · 39136 ms · 2026-05-13T17:09:20.660127+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1]

Adams, M., Ferry, K., and Yoshida, R. (2025). Inference for max-linear bayesian networks with noise.arXiv preprint arXiv:2505.00229. Am´ endola, C., Kl¨ uppelberg, C., Lauritzen, S., and Tran, N. M. (2022). Conditional independence in max- linear bayesian networks.The Annals of Applied Probability, 32(1):1–45. Bai, S., Fang, F., and Wang, T. (2025). Struc...

work page arXiv 2025
[2]

We start with the preparation of a few lemmas. Lemma 11.A non-negative random variableϵis regularly varying with indexα >0if and only if asx→ ∞, P(x−1ϵ∈ ·) P(ϵ > x) v − →να(·) whereν α is a Borel measure on(0,∞)given byν α(A) = R A αt−α−1 dtfor any Borel setA⊂(0,∞), and v − → denotes vague convergence with respect to the boundedness consisting of Borel su...

work page 2020
[3]

Then, asx→ ∞, P(x−1ϵ∈ ·) P(ϵ1 > x) v − →νϵ(·) = dX i=1 δ0 × · · · ×ν α × · · · ×δ 0(·), where the vague convergence v − →is with respect to the boundedness consisting of Borel subsets of[0,∞) d \ {0} that is each separated from the the origin0= (0, . . . ,0)∈R d. Proof.See Proposition 2.1.8 in Kulik and Soulier (2020). Lemma 13.Consider a HT-HSCMX={X 1,· ...

work page 2020
[4]

Proof.Our proof is inspired from Proposition 1 in Bai et al. (2025). By the 1-homogeneity ofF An(1) in Definition 6, we have P(X1 > x) =P FAn(1)(ϵAn(1))> x =P x−1ϵAn(1) ∈F −1 An(1)(1,∞) . Without loss of generality, assumeϵ An(1) = (ϵ 1, . . . , ϵ|An(1)|)⊤. Recall that for a continuous map, the boundary of the preimage of a set is contained in the preimag...

work page 2025
[5]

Proof of Lemma 6.Our strategy of proof is similar to that of the proof of Lemma 1 in Gnecco et al. (2021). We begin by noting the conditional expectation can be written as: E[G2(X2)|X 1 > x] = E[G2(X2)1{X1>x}] P(X1 > x) In view of the relation (13), we can decompose the indicator function as: 1{X1>x} =1 { W h∈An(1) Fh1ϵh>x} +1 {X1>x, W h∈An(1) Fh1ϵh≤x}. S...

work page 2021

[1] [1]

Adams, M., Ferry, K., and Yoshida, R. (2025). Inference for max-linear bayesian networks with noise.arXiv preprint arXiv:2505.00229. Am´ endola, C., Kl¨ uppelberg, C., Lauritzen, S., and Tran, N. M. (2022). Conditional independence in max- linear bayesian networks.The Annals of Applied Probability, 32(1):1–45. Bai, S., Fang, F., and Wang, T. (2025). Struc...

work page arXiv 2025

[2] [2]

We start with the preparation of a few lemmas. Lemma 11.A non-negative random variableϵis regularly varying with indexα >0if and only if asx→ ∞, P(x−1ϵ∈ ·) P(ϵ > x) v − →να(·) whereν α is a Borel measure on(0,∞)given byν α(A) = R A αt−α−1 dtfor any Borel setA⊂(0,∞), and v − → denotes vague convergence with respect to the boundedness consisting of Borel su...

work page 2020

[3] [3]

Then, asx→ ∞, P(x−1ϵ∈ ·) P(ϵ1 > x) v − →νϵ(·) = dX i=1 δ0 × · · · ×ν α × · · · ×δ 0(·), where the vague convergence v − →is with respect to the boundedness consisting of Borel subsets of[0,∞) d \ {0} that is each separated from the the origin0= (0, . . . ,0)∈R d. Proof.See Proposition 2.1.8 in Kulik and Soulier (2020). Lemma 13.Consider a HT-HSCMX={X 1,· ...

work page 2020

[4] [4]

Proof.Our proof is inspired from Proposition 1 in Bai et al. (2025). By the 1-homogeneity ofF An(1) in Definition 6, we have P(X1 > x) =P FAn(1)(ϵAn(1))> x =P x−1ϵAn(1) ∈F −1 An(1)(1,∞) . Without loss of generality, assumeϵ An(1) = (ϵ 1, . . . , ϵ|An(1)|)⊤. Recall that for a continuous map, the boundary of the preimage of a set is contained in the preimag...

work page 2025

[5] [5]

Proof of Lemma 6.Our strategy of proof is similar to that of the proof of Lemma 1 in Gnecco et al. (2021). We begin by noting the conditional expectation can be written as: E[G2(X2)|X 1 > x] = E[G2(X2)1{X1>x}] P(X1 > x) In view of the relation (13), we can decompose the indicator function as: 1{X1>x} =1 { W h∈An(1) Fh1ϵh>x} +1 {X1>x, W h∈An(1) Fh1ϵh≤x}. S...

work page 2021