Geometry-Aware Langevin Sampling for Matrix-Valued Graph Learning

Papri Dey

arxiv: 2603.24913 · v2 · submitted 2026-03-26 · 🧮 math.OC · math.DG · math.DS · math.PR · math.ST · stat.TH

Pith reviewed 2026-05-15 01:07 UTC · model grok-4.3

classification 🧮 math.OC · math.DG · math.DS · math.PR · math.ST · stat.TH

keywords Langevin sampling · positive semidefinite matrices · graph learning · log-determinant energy · affine-invariant metric · Metropolis-adjusted Langevin · Riemannian geometry · block Laplacian

The pith

The Hessian of the log-determinant energy equals the pullback of the affine-invariant metric on positive definite matrices, supplying closed-form intrinsic Langevin proposals for matrix-valued graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Bayesian inference on positive semidefinite matrix parameters for graph models suffers from poor mixing when proposals ignore the cone boundary. The paper defines an energy as the negative log-determinant of a lifted precision matrix formed from block Laplacians plus a stabilizer. It proves that the Hessian of this energy is precisely the pullback of the standard affine-invariant Riemannian metric on the positive definite cone through the linear map from edge weights to the precision matrix. This identity produces explicit geometry-aware proposals that use the SPD exponential map together with its Jacobian in a Metropolis-adjusted Langevin step. Experiments on rank-one cases confirm the curvature formula, and tests on posterior sampling and matrix-valued Gaussians show higher effective samples per second than Euclidean or generic Riemannian baselines.
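
For readers unfamiliar with the SPD exponential map mentioned above, here is a minimal NumPy sketch of the standard affine-invariant formula Exp_X(V) = X^{1/2} exp(X^{-1/2} V X^{-1/2}) X^{1/2}. It is not the paper's proposal: the Langevin drift, step size, and Jacobian correction are all omitted, and the helper names `sym_fun` and `spd_exp` as well as the toy matrices are assumptions made here for illustration. It only shows why a step taken through this map stays inside the cone when a Euclidean step of the same size leaves it.

```python
import numpy as np

def sym_fun(S, f):
    """Apply a scalar function f to a symmetric matrix through its eigendecomposition."""
    S = 0.5 * (S + S.T)          # remove round-off asymmetry before eigh
    w, Q = np.linalg.eigh(S)
    return (Q * f(w)) @ Q.T      # Q diag(f(w)) Q^T

def spd_exp(X, V):
    """Affine-invariant exponential map at SPD X in symmetric direction V."""
    X_half = sym_fun(X, np.sqrt)
    X_inv_half = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
    inner = sym_fun(X_inv_half @ V @ X_inv_half, np.exp)
    return X_half @ inner @ X_half

# Toy example (hypothetical numbers): an SPD matrix close to the cone boundary.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
X = Q @ np.diag([0.05, 1.0, 2.0]) @ Q.T
V = -0.2 * np.eye(3)             # a step that shrinks every eigenvalue

euclid_min_eig = np.linalg.eigvalsh(X + V).min()             # Euclidean step
intrinsic_min_eig = np.linalg.eigvalsh(spd_exp(X, V)).min()  # exponential-map step

print("Euclidean step, smallest eigenvalue:", euclid_min_eig)
print("Exponential-map step, smallest eigenvalue:", intrinsic_min_eig)
```

In this toy case the Euclidean update X + V has a negative eigenvalue, so it has left the cone, while the exponential-map update remains positive definite by construction.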

Core claim

For a PSD-weighted graph with edge kernels W_e positive semidefinite, the block Laplacian L(W) and positive definite stabilizer R define the lifted precision X(W) = L(W) + R in the space of positive definite matrices of size md. The energy is the negative log-determinant Φ(W) = −log det X(W). The Hessian of Φ at W is exactly the pullback of the affine-invariant SPD metric under the map W ↦ X(W). This Riemannian structure on the space of edge weights yields explicit intrinsic Langevin proposals that incorporate the closed-form Jacobian of the SPD exponential map and are corrected by a Metropolis-Hastings step.
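
The identity can be sketched in a few lines, assuming (as the "linear map" wording suggests) that $L$ is linear in $W$, so that $X(W)$ is affine and its second derivative vanishes. The symbols $U, V$ for directions in edge-weight space and $g_X$ for the affine-invariant metric are notation introduced here for illustration, not taken from the paper.

```latex
\begin{align*}
\Phi(W) &= -\log\det X(W), \qquad X(W) = L(W) + R, \\
D\Phi(W)[U] &= -\operatorname{tr}\!\big(X^{-1} L(U)\big), \\
D^2\Phi(W)[U,V] &= \operatorname{tr}\!\big(X^{-1} L(U)\, X^{-1} L(V)\big)
  = g_X\big(L(U), L(V)\big), \\
\text{where}\quad g_X(A,B) &= \operatorname{tr}\!\big(X^{-1} A\, X^{-1} B\big).
\end{align*}
```

The second line uses $D(\log\det X)[H] = \operatorname{tr}(X^{-1}H)$, and the third uses $D(X^{-1})[H] = -X^{-1} H X^{-1}$; no term involving the second derivative of $X(W)$ appears because the map is affine.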

What carries the argument

The pullback of the affine-invariant Riemannian metric on the positive definite cone under the linear map W to X(W) = L(W) + R, which equals the Hessian of the log-determinant energy and generates the geometry-aware Langevin proposals.

If this is right

Proposals are given in closed form without numerical approximation of the Hessian.
Multichain diagnostics remain stable in intrinsic SPD posterior sampling experiments.
Effective sample size per second exceeds that of Euclidean MALA and generic RMALA.
The method supplies a practical route to uncertainty quantification in PSD-constrained graph learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same pullback construction could be applied to any energy formed by composing log-det with a linear map into the positive definite cone.
Scalability to very large graphs would require checking whether the closed-form Jacobian remains computationally tractable.
The approach links sampling in structured covariance models directly to the well-studied Riemannian geometry of SPD matrices.

Load-bearing premise

The geometry induced by the log-determinant energy produces proposals that remain effective and stable for graph sizes and matrix dimensions beyond the small rank-one cases tested.

What would settle it

Apply the sampler to graphs with matrix dimension d larger than 5 or substantially more nodes and check whether acceptance rates or effective samples per second fall below those of Euclidean MALA while finite-difference curvature checks still match the analytic formula.
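
The curvature half of this test can be sketched as follows, for full-rank directions at d = 6. This is a hedged illustration, not the paper's experiment: the block Laplacian form L(W) = Σ_e (b_e b_eᵀ) ⊗ W_e with b_e the signed incidence vector, the choice R = I, the 4-node cycle graph, and all function names are assumptions made here. The sketch says nothing about acceptance rates or ESS per second.

```python
import numpy as np

def block_laplacian(edges, kernels, m):
    """Assumed form: L(W) = sum over edges (i, j) of (b b^T) kron W_e, b = e_i - e_j."""
    d = kernels[0].shape[0]
    L = np.zeros((m * d, m * d))
    for (i, j), W in zip(edges, kernels):
        b = np.zeros(m)
        b[i], b[j] = 1.0, -1.0
        L += np.kron(np.outer(b, b), W)
    return L

def energy(edges, kernels, R, m):
    """Phi(W) = -log det(L(W) + R)."""
    sign, logdet = np.linalg.slogdet(block_laplacian(edges, kernels, m) + R)
    if sign <= 0:
        raise ValueError("X(W) is not positive definite")
    return -logdet

def analytic_curvature(edges, kernels, dirs, R, m):
    """Pullback metric g_X(L(U), L(U)) = tr(X^{-1} L(U) X^{-1} L(U))."""
    X = block_laplacian(edges, kernels, m) + R
    A = np.linalg.solve(X, block_laplacian(edges, dirs, m))  # X^{-1} L(U)
    return np.trace(A @ A)

def fd_curvature(edges, kernels, dirs, R, m, h=1e-3):
    """Central second difference of t -> Phi(W + t U) at t = 0."""
    def phi(t):
        return energy(edges, [W + t * U for W, U in zip(kernels, dirs)], R, m)
    return (phi(h) - 2.0 * phi(0.0) + phi(-h)) / h**2

def random_sym(rng, d):
    """Random symmetric (generically full-rank) direction with unit spectral norm."""
    S = rng.standard_normal((d, d))
    S = 0.5 * (S + S.T)
    return S / np.linalg.norm(S, 2)

rng = np.random.default_rng(1)
m, d = 4, 6                                   # hypothetical graph size and kernel dimension
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
kernels = []
for _ in edges:
    A = rng.standard_normal((d, d))
    kernels.append(A @ A.T / d + np.eye(d))   # full-rank positive definite kernel
dirs = [random_sym(rng, d) for _ in edges]
R = np.eye(m * d)                             # assumed stabilizer

analytic = analytic_curvature(edges, kernels, dirs, R, m)
numeric = fd_curvature(edges, kernels, dirs, R, m)
print("analytic:", analytic, "finite difference:", numeric)
```

Under these assumptions the two numbers should agree up to finite-difference error, since X(W) is affine in W; a disagreement would point to a block Laplacian that is not linear in the edge kernels.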

Original abstract

Bayesian inference over positive semidefinite (PSD) matrix-valued parameters arises in structured covariance estimation, graph-Laplacian precision models, and multi-output graph learning, but Euclidean proposals often mix poorly near the cone boundary. We propose ConeMALA, a geometry-aware Metropolis-adjusted Langevin algorithm whose proposal geometry is induced by the model's log-determinant structure. For a PSD-weighted graph with edge kernels $W_e\succeq 0$, block Laplacian $L(W)$, and stabilizer $R\succ 0$, the lifted precision matrix $X(W)=L(W)+R\in \mathbb{S}_{++}^{md}$ defines the log-determinant energy $\Phi(W)=-\log\det X(W)$. We show that the Hessian of $\Phi$ is the pullback of the affine-invariant SPD metric under the map $W\mapsto X(W)$, yielding explicit intrinsic Langevin proposals with Metropolis-Hastings correction using the closed-form SPD exponential-map Jacobian. We validate the metric on rank-one PSD edge perturbations for $d=5$, obtaining essentially exact agreement between analytic curvature scores and finite-difference curvatures. In intrinsic SPD posterior and matrix-valued graph Gaussian experiments, ConeMALA achieves stable multichain diagnostics and substantially higher ESS/sec than Euclidean MALA and generic RMALA, while a PDHMC-like finite-difference baseline is accurate but computationally prohibitive at larger graph sizes. These results show that pullback log-determinant geometry provides a practical route to uncertainty quantification in PSD-constrained graph learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

ConeMALA gives explicit pullback proposals from the log-det energy on block Laplacians, but the Hessian identity is only verified on rank-one d=5 cases.

The paper's main contribution is ConeMALA, a MALA variant whose proposal comes from pulling back the affine-invariant metric on SPD matrices through the linear map W to X(W) = L(W) + R, where the energy is -log det X(W). This yields closed-form Langevin steps and an explicit Jacobian for the Metropolis-Hastings correction. The derivation follows from the standard geometry of the SPD cone and the chain rule on the lifted precision matrix, so the algebra checks out on paper. In the reported experiments it produces higher ESS per second than Euclidean MALA or generic RMALA on intrinsic SPD posterior sampling and matrix-valued graph Gaussian tasks, while a finite-difference PDHMC baseline stays accurate but slows down at larger graphs. That practical speed-up is the clearest win here.

The limitation is the narrow check on the central claim. The analytic curvature is compared to finite differences only for rank-one PSD edge perturbations at d=5, with essentially exact agreement reported. No results are given for full-rank W or d greater than 5, so it is not yet clear whether the pullback identity holds without extra terms that vanish only in the rank-one setting. The experiments also lack error bars and broader controls on graph size or conditioning.

This work is aimed at people already using Riemannian MCMC for PSD-constrained covariance or graph-Laplacian models who want geometry that matches the log-det structure rather than a generic metric. It deserves a serious referee because the construction is explicit and the empirical gains are concrete, even if the validation of the geometry needs to be expanded before the method can be trusted for general use.

Referee Report

1 major / 2 minor

Summary. The paper proposes ConeMALA, a geometry-aware Metropolis-adjusted Langevin algorithm for Bayesian sampling over PSD matrix-valued graph parameters. The central claim is that for the log-determinant energy Φ(W) = −log det X(W) with X(W) = L(W) + R, the Hessian of Φ is exactly the pullback of the affine-invariant SPD metric under the linear map W ↦ X(W). This yields explicit intrinsic Langevin proposals together with a closed-form Jacobian of the SPD exponential map for the Metropolis-Hastings correction. The identity is checked by finite-difference agreement on rank-one PSD edge perturbations at d=5. Experiments on intrinsic SPD posteriors and matrix-valued graph Gaussians report stable multichain diagnostics and higher ESS/sec than Euclidean MALA and generic RMALA, while a PDHMC-like finite-difference baseline is accurate but computationally prohibitive at larger graph sizes.

Significance. If the pullback identity holds for general full-rank W and d > 5, the construction supplies a parameter-free, geometry-aware proposal that respects the PSD cone and improves mixing near the boundary. This would be a concrete advance for uncertainty quantification in structured covariance estimation and multi-output graph learning, where Euclidean proposals are known to mix poorly.

major comments (1)

[Abstract / validation paragraph] The finite-difference validation of the Hessian-pullback identity (Abstract and the numerical verification paragraph) is reported exclusively for rank-one PSD edge perturbations at d=5. No algebraic derivation for general W_e or numerical check for full-rank cases and d>5 is provided; if any term in the pullback vanishes identically only in the rank-one setting, the claimed proposals and closed-form Jacobian would not hold for the general graphs the method targets.

minor comments (2)

[Abstract] The abstract states 'substantially higher ESS/sec' without quoting the actual ratios or reporting the number of chains, burn-in, and thinning used; these numbers should appear in the experimental tables or text.
[Introduction / model section] Notation for the block Laplacian L(W) and stabilizer R is introduced without an explicit definition of the edge kernels W_e in the main text; a short paragraph or equation block would improve readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and insightful comments. Below we provide a point-by-point response to the major comment.

Referee: [Abstract / validation paragraph] The finite-difference validation of the Hessian-pullback identity (Abstract and the numerical verification paragraph) is reported exclusively for rank-one PSD edge perturbations at d=5. No algebraic derivation for general W_e or numerical check for full-rank cases and d>5 is provided; if any term in the pullback vanishes identically only in the rank-one setting, the claimed proposals and closed-form Jacobian would not hold for the general graphs the method targets.

Authors: We agree that the finite-difference checks are limited to the rank-one case at d=5. The general algebraic derivation of the pullback identity is presented in the main body of the paper (see the paragraph following the definition of Φ(W) and the subsequent Hessian calculation), where we show using matrix calculus that ∇²Φ(W) equals the pullback metric without any rank restriction on W_e. The specific choice of rank-one perturbations for numerical verification was made to isolate the effect while keeping the computation tractable. Nevertheless, we recognize the value of broader validation and will add numerical experiments for full-rank W at d>5 (specifically d=10) as well as a dedicated appendix with the full derivation in the revised manuscript.
Revision: yes

Circularity Check

0 steps flagged

Pullback Hessian identity is a direct derivation from definitions

The paper's core step is the explicit computation that the Hessian of Φ(W) = −log det X(W) equals the pullback of the affine-invariant SPD metric under the linear map W ↦ X(W) = L(W) + R. This identity is obtained via the chain rule applied to the standard log-det Hessian on the SPD cone and the differential of the map X(W); it does not presuppose the result, fit any parameter to data, or rely on a self-citation whose content is the claim itself. The rank-one d=5 finite-difference checks are presented only as numerical confirmation of the already-derived formula, not as the justification for it. No load-bearing step reduces to its own input by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The derivation relies on standard properties of the affine-invariant metric on SPD matrices and the smoothness of the map from edge weights to the lifted precision matrix; no free parameters or new entities are introduced.

axioms (1)

domain assumption: The map W ↦ X(W) = L(W) + R is smooth and the Hessian of −log det X(W) equals the pullback of the affine-invariant metric.
Invoked to obtain the intrinsic Langevin proposal; this is a standard fact from information geometry on SPD manifolds.

pith-pipeline@v0.9.0 · 5582 in / 1394 out tokens · 39860 ms · 2026-05-15T01:07:30.566214+00:00 · methodology
