
arxiv: 2606.18997 · v1 · pith:FGELP2AFnew · submitted 2026-06-17 · 💻 cs.LG

DIPHINE: Diffusion-based Φ-ID Neural Estimator

Pith reviewed 2026-06-26 21:26 UTC · model grok-4.3

classification 💻 cs.LG
keywords ΦID · mutual information estimation · score-based diffusion models · Möbius inversion · information decomposition · continuous dynamical systems · neural estimators · non-Gaussian data

The pith

DIPHINE uses one score-based diffusion model to estimate all mutual information terms for ΦID and recovers the sixteen atoms via Möbius inversion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DIPHINE to enable Integrated Information Decomposition for continuous non-Gaussian multivariate dynamical systems. ΦID decomposes information dynamics into sixteen atoms that separate redundant, unique, and synergistic modes of storage, transfer, and integration. Prior methods were limited to Gaussian or discrete data, so the approach trains a single amortized diffusion network to produce the necessary mutual information estimates and applies Möbius inversion to obtain the atoms. Theoretical analysis tracks how estimation errors propagate through the inversion, proving that the Jacobian is integer-valued and that the synergy-to-synergy atom is the hardest to recover. Experiments show accurate recovery of ground-truth atoms on synthetic data, better performance than standard mutual-information estimators, and interpretable structure on real physiological recordings without any distributional assumptions.

Core claim

DIPHINE is the first neural estimator that leverages score-based diffusion models to jointly estimate all the mutual information terms required by ΦID from a single amortized network, recovering the sixteen atoms through Möbius inversion. It provides a theoretical analysis of error propagation through the inversion, showing that the Jacobian of the mapping from mutual informations to atoms is integer-valued and that the synergy-to-synergy atom is provably the hardest to estimate. The method demonstrates accurate recovery of ground-truth atoms on synthetic benchmarks, superior performance compared to established mutual information estimators, and the ability to extract physiologically interpretable information-dynamic structure from real data without any distributional assumptions.

What carries the argument

A single score-based diffusion model that jointly estimates the mutual information terms needed by ΦID, followed by Möbius inversion to recover the sixteen atoms.
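
As a rough illustration of the inversion step, the sketch below works on the smaller two-source redundancy lattice (four nodes: redundant, two unique, synergistic) rather than the full sixteen-atom case. The minimum-mutual-information (MMI) redundancy used for the bottom node is an assumption made here for concreteness; this summary does not say which redundancy function the paper adopts. The names `NODES`, `ZETA`, `MOBIUS`, and `pid_atoms` are hypothetical.

```python
import numpy as np

# Nodes of the two-source redundancy lattice, ordered bottom to top.
NODES = ["Red", "Un1", "Un2", "Syn"]

# Zeta matrix: ZETA[i, j] = 1 when node j lies at or below node i,
# so that cumulative = ZETA @ atoms.
ZETA = np.array([
    [1, 0, 0, 0],  # Red is the bottom node
    [1, 1, 0, 0],  # Un1 sits above Red
    [1, 0, 1, 0],  # Un2 sits above Red
    [1, 1, 1, 1],  # Syn sits above every node
])

# Moebius matrix: the exact inverse of ZETA, which is integer-valued.
MOBIUS = np.rint(np.linalg.inv(ZETA)).astype(int)


def pid_atoms(i_1, i_2, i_12):
    """Atoms from I(X1;Y), I(X2;Y), I(X1,X2;Y), in nats.

    Assumption: the redundancy node is filled with min(i_1, i_2) (MMI).
    """
    cumulative = np.array([min(i_1, i_2), i_1, i_2, i_12])
    return dict(zip(NODES, MOBIUS @ cumulative))


atoms = pid_atoms(0.3, 0.5, 1.0)
for name, value in atoms.items():
    print(f"{name}: {value:.3f}")
```

The four atoms sum back to the joint mutual information, which is a quick sanity check on any inversion of this kind.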

If this is right

  • Accurate recovery of ground-truth atoms on synthetic benchmarks for continuous non-Gaussian data.
  • Superior performance relative to established mutual information estimators.
  • Extraction of physiologically interpretable information-dynamic structure from real data without distributional assumptions.
  • Error propagation through the inversion is governed by an integer-valued Jacobian.
  • The synergy-to-synergy atom is provably the hardest to estimate.
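
The last two points can be made concrete with a small sketch. It assumes the sixteen-node lattice is ordered as the product of two four-node redundancy lattices, so that its Möbius matrix is a Kronecker product, and it treats the sixteen cumulative lattice quantities (not the raw mutual informations) as the inputs. Both are simplifying assumptions, and `MOBIUS_4`, `JACOBIAN`, and `atom_names` are hypothetical names.

```python
import numpy as np

LABELS = ["Red", "Un1", "Un2", "Syn"]

# Moebius matrix of the four-node redundancy lattice (integer-valued).
MOBIUS_4 = np.array([
    [ 1,  0,  0, 0],
    [-1,  1,  0, 0],
    [-1,  0,  1, 0],
    [ 1, -1, -1, 1],
])

# Assumption: product order on (past node, future node), so the
# 16 x 16 inversion matrix is the Kronecker product of the two factors.
JACOBIAN = np.kron(MOBIUS_4, MOBIUS_4)
atom_names = [f"{past}->{future}" for past in LABELS for future in LABELS]

# With independent input errors of equal standard deviation sigma,
# the standard deviation of atom k is sigma times the L2 norm of row k.
row_l2 = np.sqrt((JACOBIAN ** 2).sum(axis=1))
worst = atom_names[int(np.argmax(row_l2))]
best = atom_names[int(np.argmin(row_l2))]

for name, norm in zip(atom_names, row_l2):
    print(f"{name:>9s}  amplification {norm:.3f}")
print("largest amplification:", worst)
print("smallest amplification:", best)
```

Under these assumptions every entry of the Syn->Syn row is +1 or -1, so that atom accumulates error from all sixteen inputs, while Red->Red depends on a single input.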

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The amortized single-network design may scale to higher-dimensional systems where separate estimators would become prohibitive.
  • The integer Jacobian property could be used to design targeted regularization that protects the most sensitive atoms during training.
  • The same diffusion-plus-inversion pipeline might be applied to other information decompositions that rely on Möbius inversion over lattices.
  • Real-data applications could be extended by testing whether the extracted atoms remain stable under controlled perturbations of the observed time series.

Load-bearing premise

A single score-based diffusion model produces sufficiently accurate estimates of all required mutual-information terms for continuous non-Gaussian data so that Möbius inversion yields reliable atoms.

What would settle it

A synthetic continuous non-Gaussian process with independently computed ground-truth ΦID atoms on which DIPHINE's recovered atoms deviate beyond the error bounds predicted by the integer Jacobian analysis.

Figures

Figures reproduced from arXiv: 2606.18997 by Giulio Franzese, Maurizio Filippone, Mustapha Bounoua, Pietro Michiardi, Simon Pedro Galeano Munoz.

Figure 1: ΦID atom MAE (top) and standard deviation (bottom) for d = 3 (left) and d = 10 (right) at n = 100,000.
Figure 2: MI MAE (left) and atom MAE (right) by estimator and dimension, averaged over all …
Figure 3: Per-atom MAE by estimator, averaged across all systems and dimensions.
Figure 5: TE decomposition into ΦID atoms. Syn→Red dominates in both directions.
Figure 6: The redundancy lattice for two sources. Each node is one of the four antichains in Eq. (14).
Figure 7: The double-redundancy lattice for two source and two target variables. Each node is one of …
Figure 8: Estimated MI values versus analytic ground truth for three Gaussian VAR(1) systems at …
Figure 9: ΦID atom MAE (top row) and standard deviation (bottom row) for the three d = 1 systems at n = 100,000. The Syn→Syn atom consistently exhibits the largest MAE, as predicted by the error propagation analysis in § 4.5. The Red→Red atom is the most accurately estimated across all systems.
Figure 10: Estimated MI values versus ground truth for …
Figure 11: Estimated MI values versus ground truth for …
Figure 12: Estimated MI values versus ground truth for …
Figure 13: ΦID atom MAE and standard deviation for d = 5 at n = 100,000. The error pattern is consistent with d = 1 and d = 3: atoms involving synergy on both sides accumulate the largest errors.
Figure 14: MI MAE (left) and atom MAE (right) under identity, half-cube, and CDF transforms …
Figure 15: MI MAE (left) and atom MAE (right) as a function of sample size for the three …
Figure 16: DIPHINE MI MAE (left) and atom MAE (right) for …
Figure 17: All methods at d = 3, n = 100,000. InfoNCE and NWJ remain competitive with DIPHINE; KSG degrades substantially.
Figure 18: All methods at d = 5, n = 100,000. InfoNCE achieves lower MI MAE than DIPHINE, but DIPHINE pulls ahead on atom MAE. KSG deteriorates further.
Figure 19: All methods at d = 10, n = 100,000. DIPHINE achieves the lowest atom MAE; MINE and KSG degrade catastrophically.
Figure 20: Complete 4 × 4 atom matrices for both directions and both age groups. The largest atoms are the self-storage terms Uni→Uni and the synergistic integration Syn→Syn. The overall structure is remarkably consistent between young and elderly subjects, with the primary difference being a quantitative increase in the self-storage atoms in the elderly group.
Figure 21: Mean cross-seed atom variance for each direction and age group in the Fantasia dataset.
Figure 22: Distribution of TE estimates across subjects for both directions in young (left) and elderly …
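
Several of the captions above compare estimates with analytic ground truth for Gaussian VAR(1) systems. A minimal sketch of how such ground-truth mutual information can be computed in closed form is given below; the specific matrices `A` and `Q` are illustrative assumptions, not the systems used in the paper, and the function names are hypothetical.

```python
import numpy as np


def stationary_cov(A, Q):
    """Solve Sigma = A Sigma A^T + Q for a stable VAR(1) by vectorization."""
    n = A.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec.reshape(n, n)


def time_delayed_cov(A, Q):
    """Covariance of the stacked vector (X_t, X_{t+1}) for X_{t+1} = A X_t + noise."""
    S = stationary_cov(A, Q)
    return np.block([[S, S @ A.T], [A @ S, S]])


def gaussian_mi(cov, idx_a, idx_b):
    """I(A;B) in nats for jointly Gaussian blocks of the covariance matrix."""
    idx_a, idx_b = list(idx_a), list(idx_b)
    joint = idx_a + idx_b
    _, logdet_a = np.linalg.slogdet(cov[np.ix_(idx_a, idx_a)])
    _, logdet_b = np.linalg.slogdet(cov[np.ix_(idx_b, idx_b)])
    _, logdet_ab = np.linalg.slogdet(cov[np.ix_(joint, joint)])
    return 0.5 * (logdet_a + logdet_b - logdet_ab)


# Illustrative two-variable system (assumed values, eigenvalues inside the unit circle).
A = np.array([[0.5, 0.2], [0.0, 0.4]])
Q = np.eye(2)
C = time_delayed_cov(A, Q)

mi_full = gaussian_mi(C, [0, 1], [2, 3])   # I(X_t ; X_{t+1})
mi_1_to_1 = gaussian_mi(C, [0], [2])       # I(X1_t ; X1_{t+1})
print(f"I(X_t; X_t+1)   = {mi_full:.4f} nats")
print(f"I(X1_t; X1_t+1) = {mi_1_to_1:.4f} nats")
```

Evaluating `gaussian_mi` on every past and future subset used by the decomposition yields the reference values against which an estimator's MI MAE can be measured.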
Original abstract

Uncovering the true informational architecture of real-world complex systems requires disentangling how their components uniquely store, redundantly share, and synergistically integrate information over time. Integrated Information Decomposition (ΦID) is a framework for decomposing the information dynamics of multivariate systems into sixteen non-overlapping atoms that characterize redundant, unique, and synergistic modes of information storage, transfer, and integration. Existing methods to compute ΦID are restricted to Gaussian or discrete systems, preventing its application to continuous non-Gaussian dynamical systems. We address this limitation by proposing DIPHINE (Diffusion-based Φ-ID Neural Estimator), the first neural estimator that leverages score-based diffusion models to jointly estimate all the mutual information terms required by ΦID from a single amortized network, recovering the sixteen atoms through Möbius inversion. We provide a theoretical analysis of error propagation through the inversion, showing that the Jacobian of the mapping from mutual informations to atoms is integer-valued and that the synergy-to-synergy atom is provably the hardest to estimate. We demonstrate accurate recovery of ground-truth atoms on synthetic benchmarks, superior performance compared to established mutual information estimators, and the ability to extract physiologically interpretable information-dynamic structure on an application involving real data without any distributional assumptions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes DIPHINE, the first neural estimator that uses a single amortized score-based diffusion model to jointly estimate all mutual-information terms required by ΦID, then recovers the sixteen atoms via Möbius inversion. It supplies a theoretical analysis of error propagation through the integer-valued Jacobian (highlighting the synergy-to-synergy atom as hardest) and reports accurate recovery on synthetic benchmarks, superior performance versus existing MI estimators, and interpretable results on real physiological data without distributional assumptions.

Significance. If the central claim holds, the work would remove the Gaussian/discrete restriction that has limited ΦID to date, enabling its use on continuous non-Gaussian dynamical systems; the amortized single-network design and explicit error-propagation analysis are genuine strengths that could make the method practically usable.

major comments (2)
  1. [Abstract / method description] The central claim that one diffusion-based MI estimator produces errors small enough and sufficiently uncorrelated for stable Möbius inversion to sixteen atoms rests on unverified experimental assertions; the abstract states accurate recovery and a theoretical analysis but supplies no quantitative error bars, dataset sizes, or ablation results on per-term MI accuracy (reader’s soundness note).
  2. [Theoretical analysis section] The theoretical error-propagation claim is stated but not shown in sufficient detail to confirm that the integer Jacobian does not amplify bias or variance into uninterpretable atoms for continuous non-Gaussian data; the paper identifies the synergy-to-synergy atom as hardest yet does not demonstrate that the diffusion estimator keeps its error below the threshold needed for reliable recovery.
minor comments (2)
  1. [Abstract] The abstract asserts “superior performance compared to established mutual information estimators” without naming the baselines, metrics, or dataset characteristics used for that comparison.
  2. [Introduction / method] Notation for the sixteen ΦID atoms and the precise mapping from the estimated MI terms to those atoms should be introduced earlier and with an explicit table or diagram to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript accordingly to provide additional quantitative details and expanded theoretical exposition.

  1. Referee: [Abstract / method description] The central claim that one diffusion-based MI estimator produces errors small enough and sufficiently uncorrelated for stable Möbius inversion to sixteen atoms rests on unverified experimental assertions; the abstract states accurate recovery and a theoretical analysis but supplies no quantitative error bars, dataset sizes, or ablation results on per-term MI accuracy (reader’s soundness note).

    Authors: We agree that the abstract would benefit from explicit quantitative support. In the revision we will augment the abstract with key numerical results (mean absolute errors with standard deviations across the 16 atoms, dataset sizes for the synthetic benchmarks, and a reference to the per-term MI ablation studies). These metrics and ablations already appear in Sections 4–5; the change is therefore limited to improved visibility in the abstract and method overview. revision: yes

  2. Referee: [Theoretical analysis section] The theoretical error-propagation claim is stated but not shown in sufficient detail to confirm that the integer Jacobian does not amplify bias or variance into uninterpretable atoms for continuous non-Gaussian data; the paper identifies the synergy-to-synergy atom as hardest yet does not demonstrate that the diffusion estimator keeps its error below the threshold needed for reliable recovery.

    Authors: We acknowledge that the current theoretical section derives the integer Jacobian and identifies the synergy-to-synergy atom via its maximal coefficients, but does not supply explicit numerical verification of error thresholds under the diffusion estimator’s bias/variance profile. We will expand the section with (i) a short derivation of worst-case amplification bounds and (ii) Monte-Carlo simulations that inject realistic diffusion-model errors into the MI vector and track atom recovery error, confirming that the observed per-term errors remain below the stability threshold for all sixteen atoms on continuous non-Gaussian data. revision: yes
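
A Monte-Carlo check of the kind described in this response could look roughly like the sketch below. It is not the authors' experiment: it assumes independent Gaussian errors of equal size on sixteen cumulative lattice inputs and a Kronecker-product inversion matrix, and `simulate_atom_mae` is a hypothetical helper. Because the inversion is linear, the atom errors do not depend on the true values, so only the noise is simulated.

```python
import numpy as np

# Moebius matrix of the four-node redundancy lattice (Red, Un1, Un2, Syn).
MOBIUS_4 = np.array([
    [ 1,  0,  0, 0],
    [-1,  1,  0, 0],
    [-1,  0,  1, 0],
    [ 1, -1, -1, 1],
])
# Assumption: the 16-atom inversion is the Kronecker product of two copies.
JACOBIAN = np.kron(MOBIUS_4, MOBIUS_4)


def simulate_atom_mae(sigma, n_trials=20000, seed=0):
    """Inject N(0, sigma^2) errors into the 16 inputs and return per-atom MAE."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_trials, 16))
    atom_errors = noise @ JACOBIAN.T  # linear propagation through the inversion
    return np.abs(atom_errors).mean(axis=0)


sigma = 0.01
mae = simulate_atom_mae(sigma)

# Closed-form prediction: each atom error is Gaussian with standard deviation
# sigma * ||row||_2, and the mean absolute value of N(0, s^2) is s * sqrt(2 / pi).
predicted = sigma * np.sqrt(2.0 / np.pi) * np.linalg.norm(JACOBIAN, axis=1)

print("simulated MAE, Syn->Syn:", round(float(mae[15]), 5))
print("predicted MAE, Syn->Syn:", round(float(predicted[15]), 5))
```

Replacing the independent Gaussian noise with errors resampled from an estimator's actual residuals would test whether correlated or biased errors change the picture.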

Circularity Check

0 steps flagged

No circularity detected; derivation is self-contained new pipeline

The paper presents DIPHINE as a novel combination of score-based diffusion models for amortized joint estimation of the mutual information terms in the ΦID decomposition, followed by standard Möbius inversion to recover the 16 atoms. The abstract and claims describe this as a new estimation pipeline with an independent theoretical analysis of error propagation (integer Jacobian, hardest atom identified). No equations or steps reduce any claimed result to a fitted parameter renamed as prediction, a self-citation chain, or a definitional tautology. The method is grounded in external techniques (diffusion models, Möbius inversion) without load-bearing self-referential reductions. This is the normal case of an independent methodological contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The abstract states no explicit free parameters or invented entities; the method relies on standard properties of score-based diffusion models and Möbius inversion, which are treated as background.

axioms (1)
  • [standard math] Möbius inversion correctly recovers the 16 ΦID atoms from the estimated mutual-information terms
    Invoked when the paper states that the sixteen atoms are recovered through Möbius inversion.

pith-pipeline@v0.9.1-grok · 5769 in / 1264 out tokens · 26630 ms · 2026-06-26T21:26:02.762878+00:00 · methodology

