{"paper":{"title":"A Global Characterization of $f$-Divergences Yielding PSD Mutual-Information Matrices","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Mutual-information matrices from f-divergences are positive semidefinite for all finite alphabets precisely when the normalized generator expands as a power series with nonnegative coefficients that converges on all positive reals.","cross_cats":["math.IT"],"primary_cat":"cs.IT","authors_text":"Zachary Robertson","submitted_at":"2026-01-13T19:09:19Z","abstract_excerpt":"Given $n$ random variables, when does the matrix of pairwise $f$-mutual informations define a PSD kernel over variables? For convex finite generators $f:(0,\\infty)\\to\\mathbb{R}$ with $f(1)=0$ and finite boundary value $f(0)$, we give a closed characterization up to linear transformation $f\\sim f+c(t-1)$, which leaves every $f$-divergence and every $f$-mutual-information matrix unchanged. The matrix $M^{(f)}_{ij}:=I_f(X_i;X_j)$ is PSD for every finite-alphabet family if and only if the normalized representative has a globally convergent expansion $\\bar f(t)=\\sum_{m\\ge2}a_m(t-1)^m$, with $a_m\\ge"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"The matrix M^{(f)}_{ij}:=I_f(X_i;X_j) is PSD for every finite-alphabet family if and only if the normalized representative has a globally convergent expansion bar f(t)=sum_{m>=2} a_m (t-1)^m, with a_m >=0, on all of (0,infty).","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the local positivity condition at t=1, extracted via biased three-point kernels and the BGKP theorem, extends to global analyticity and holds for all finite alphabets without additional restrictions on the divergence.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Pairwise f-mutual information 
matrices are positive semi-definite for all finite-alphabet distributions exactly when the f generator has a power series with all nonnegative coefficients that converges on the positive reals.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Mutual-information matrices from f-divergences are positive semidefinite for all finite alphabets precisely when the normalized generator expands as a power series with nonnegative coefficients that converges on all positive reals.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"36331d8bbc5663bf801ff50a529b67cdf3a5bfe52d44b6c1ea131f258c778c73"},"source":{"id":"2601.08929","kind":"arxiv","version":3},"verdict":{"id":"995b01ac-db5b-467b-9b40-ee43b482cacd","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T14:19:49.410149Z","strongest_claim":"The matrix M^{(f)}_{ij}:=I_f(X_i;X_j) is PSD for every finite-alphabet family if and only if the normalized representative has a globally convergent expansion bar f(t)=sum_{m>=2} a_m (t-1)^m, with a_m >=0, on all of (0,infty).","one_line_summary":"Pairwise f-mutual information matrices are positive semi-definite for all finite-alphabet distributions exactly when the f generator has a power series with all nonnegative coefficients that converges on the positive reals.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the local positivity condition at t=1, extracted via biased three-point kernels and the BGKP theorem, extends to global analyticity and holds for all finite alphabets without additional restrictions on the divergence.","pith_extraction_headline":"Mutual-information matrices from f-divergences are positive semidefinite for all finite alphabets precisely when the normalized generator expands as a power series with nonnegative coefficients that converges on all positive 
reals."},"references":{"count":15,"sample":[{"doi":"","year":2002,"title":"The mutual information: detecting and evaluating dependencies between variables","work_id":"fea514a5-c474-40a4-8191-09cb088b6324","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2011,"title":"Detecting novel associations in large data sets","work_id":"2a29cb7f-f51c-4e07-9aaa-8f1c8b20f120","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2015,"title":"G. Ver Steeg and A. Galstyan, “The information sieve,” in International Conference on Machine Learning. PMLR, 2015, pp. 164–172","work_id":"4e6194ae-cd5c-40d0-a24d-0c71dbf85498","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"How transformers learn causal structure with gradient descent","work_id":"15b3610a-8bd8-4edd-9296-a4ad1ee0cbaf","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2014,"title":"Mutual information matrices are not always positive semidefinite","work_id":"e1cede26-bd45-4e22-9829-14606a05006e","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":15,"snapshot_sha256":"bde82dbe5d38aa658d2913bdc2b1bcb600aa6d2d561b5adcc8088e589eb7196c","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"5f66a16c9e181e8738c401ff9bc9cc1d217565dcb9d3d95330702f1659fe1abc"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}