Extracting Transport Properties of Quark-Gluon Plasma from the Heavy-Quark Potential With Neural Networks in a Holographic Model
Recognition: 3 theorem links (Lean)
Pith reviewed 2026-05-06 21:12 UTC · model claude-opus-4-7
The pith
A neural network reads off a holographic metric from the lattice quark-antiquark potential and reuses it to predict heavy-quark transport in the quark-gluon plasma.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors fit a holographic background to lattice QCD heavy-quark potential data using a Kolmogorov-Arnold Network, which yields a closed-form expression for the deformation factor w(r) of an Andreev-Zakharov-type AdS5 metric and a value g ≈ 0.2573 for the string-tension constant. The same learned w(r) is then carried over, without further tuning, to compute the heavy-quark drag force, momentum diffusion coefficient, and jet quenching parameter at finite temperature and chemical potential, and these transport quantities come out qualitatively consistent with lattice and RHIC/LHC results.
What carries the argument
A Kolmogorov-Arnold Network used as an inverse-problem solver for a bottom-up holographic metric: the network's learnable univariate activations let it express w(r) in closed form (here, w(r) = 2.94 sin(1.24 sin(0.47r + 1.94) − 9.55) + 3.53), so the trained model is a symbolic geometry rather than a black box, and the same geometry is then reused to compute drag, diffusion, and jet quenching.
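Because w(r) is quoted in closed form, it can be evaluated and differentiated directly. The following is a minimal sketch, not the authors' code: the function names `w` and `dw` and the sample radii are illustrative choices, and only the six numerical coefficients come from the text above.

```python
import math


def w(r: float) -> float:
    """Closed-form deformation factor w(r) as quoted on this page."""
    return 2.94 * math.sin(1.24 * math.sin(0.47 * r + 1.94) - 9.55) + 3.53


def dw(r: float) -> float:
    """Analytic derivative w'(r), obtained with the chain rule."""
    inner = 0.47 * r + 1.94
    outer = 1.24 * math.sin(inner) - 9.55
    return 2.94 * math.cos(outer) * 1.24 * math.cos(inner) * 0.47


if __name__ == "__main__":
    # Sample radii are arbitrary; r is dimensionless (AdS radius set to 1).
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"r={r:.1f}  w={w(r):.4f}  w'={dw(r):.4f}")
```

Evaluating at r = 0 gives w(0) close to 1 and a positive w'(0), which is the kind of audit a symbolic metric allows and a black-box network does not.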
If this is right
- A geometry trained only on the static heavy-quark potential carries enough information to reproduce, qualitatively, drag, diffusion, and jet quenching across T and μ.
- The deconfinement line in the (T, μ) plane derived from this learned background ends near μ ≈ 0.93 GeV, broadly compatible with effective-model phase diagrams.
- Closed-form w(r) makes the holographic model auditable: one can read off, differentiate, and analytically continue the metric rather than query a network.
- The reported training time for the KAN is about 19 times shorter than for a comparable multilayer perceptron (851 s vs. 15991 s), with a lower final loss (0.0328 vs. 0.0847), suggesting KANs are practical for inverse holography.
- The same procedure can in principle absorb additional QCD inputs (equation of state, meson spectrum) into a single holographic background.
Where Pith is reading between the lines
- Because w(r) is fit only to the zero-T potential, the agreement of the derived transport coefficients with finite-T lattice data is effectively a consistency test of the Andreev-Zakharov ansatz, not of the network — the KAN is doing symbolic regression inside a fixed metric class.
- The specific symbolic form (nested sines plus a constant) has no obvious physical motivation and is likely one of many near-degenerate minima; reporting the spread over retrainings would clarify how much of w(r) is data-driven versus architecture-driven.
- Carrying the same w(r) into the (T, μ) sector implicitly assumes the deformation factor is independent of the blackening function f(r); relaxing that — letting w depend on the thermal state — is a natural next test.
- The framework invites a direct head-to-head with Einstein-Maxwell-Dilaton holographic models that solve the bulk equations of motion: KANs here bypass the equations of motion, which is a feature for fitting but a liability for predictivity.
Load-bearing premise
That a single deformation factor w(r) fitted only to the zero-temperature static potential genuinely encodes the right physics for finite-T transport, and that the Andreev-Zakharov-style AdS5-RN ansatz with this w(r) is rich enough to represent real QCD rather than just to interpolate it.
What would settle it
Use the learned w(r) and g to predict a quantity that did not enter the training — for example, the heavy-quark diffusion coefficient 2πTD at T ≈ 1.5 Tc, or q̂/T³ at LHC temperatures — and compare to current lattice and Bayesian extractions. A persistent quantitative disagreement beyond the stated error bands would show the single-w(r) ansatz cannot simultaneously fit the static potential and the transport sector.
Original abstract
Using Kolmogorov-Arnold Networks (KANs), we construct a holographic model informed by lattice QCD data. This neural network approach enables the derivation of an analytical solution for the deformation factor $w(r)$ and the determination of a constant $g$ related to the string tension. Within the KANs-based holographic framework, we further analyze heavy quark potentials under finite temperature and chemical potential conditions. Additionally, we calculate the drag force, jet quenching parameter, and diffusion coefficient of heavy quarks in this paper. Our findings demonstrate qualitative consistency with both experimental measurements and established phenomenological model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The authors use a Kolmogorov-Arnold Network (KAN) to fit the deformation factor w(r) and the string-tension-like constant g of an Andreev-Zakharov-type holographic background to lattice/Cornell heavy-quark-potential data at T=0. They report a closed-form analytical w(r)=2.94 sin(1.24 sin(0.47 r+1.94)−9.55)+3.53 with g=0.2573, then use this background, augmented with the AdS-RN blackening factor f(r) of Eqs. (2)–(3), to compute the heavy-quark potential at finite T and μ, the Polyakov loop and a deconfinement line Tc(μ), the drag force, energy loss for c and b quarks, the spatial diffusion coefficient 2πTD, and the jet quenching parameter q̂/T³. Results are presented as qualitatively consistent with lattice and RHIC/LHC determinations and with previous EMD-based holographic computations.
Significance. The work is a useful, concrete demonstration that KANs can produce compact, closed-form approximants for holographic metric functions in inverse problems, with reported training-time advantages over MLPs (851 s vs. 15991 s to comparable loss). Delivering an analytical w(r) is genuinely helpful for downstream observables: it lets the same background be reused for potentials, Polyakov loop, drag, diffusion, and q̂ in a single pipeline, which is a methodological contribution beyond Refs. [108,109]. The finite-T and finite-μ phenomenology is qualitatively consistent with HotQCD-type lattice results for 2πTD and with JET/Bayesian q̂. The paper is best read as a proof-of-concept that ML-assisted holography can be made end-to-end and reproducible (code is referenced on GitHub), rather than as a quantitative determination of QGP transport. Within that scope it is a sensible incremental addition to the ML-holography literature.
major comments (5)
- [§II, training target and uniqueness of w(r)] The KAN is trained against the Cornell form E=0.8404 L−0.0866/L+0.1033, i.e. effectively three numbers, with five internal KAN parameters plus a free g and soft constraints. Andreev-Zakharov reproduce the same Cornell-type behavior with w=exp(sr²), s=0.45, g=0.176, whereas the present work obtains g=0.2573 — a ~46% shift in the string-tension-related constant from essentially the same input. The manuscript should (i) quantify the uniqueness of the (w,g) pair, e.g. by reporting the loss landscape or by retraining with different KAN initializations/architectures and showing the spread in the inferred g and in transport observables, and (ii) discuss why g differs so substantially from the standard AZ value while fitting the same potential. As written, the inverse problem appears underdetermined and this directly affects all downstream numbers.
- [§III, Eqs. (8)–(12): radial domain of validity] Drag, diffusion and q̂ are evaluated at r=r_s (with f(r_s)=v²) and via integrals up to r=r_h. As T grows, r_h decreases and the relevant radial region moves into the deep IR, beyond the range that the T=0 Cornell data actually constrain (roughly the range mapped from L∈[0,1.2] fm via Eq. (5)). The trained sin-of-sin ansatz is bounded and oscillatory and its IR behavior is largely set by the basis, not by data. The authors should show explicitly, for each figure in Figs. 6–10, the radial interval (r_s,…,r_h) actually probed and overlay it on the data-constrained range of w(r). Without this, the temperature dependence of the transport coefficients is an extrapolation of an underdetermined fit and the agreement with RHIC/LHC q̂ is consistent with a broad class of monotonic backgrounds.
- [§II, Fig. 5 and Eqs. (3)–(4)] The deconfinement curve Tc(μ) ending at μ≈0.93 GeV is presented as qualitatively consistent with NJL/PQM expectations, but it is derived from a w(r) that has not been constrained at finite μ at all; Eq. (3) simply rewrites the charge q of Eq. (2) in terms of μ. The authors should clarify what physical input fixes the μ-dependence beyond the AdS-RN ansatz, and whether the location of the endpoint depends on the KAN training (e.g., on the regularization weights or on the seed), since this is the only finite-μ prediction shown.
- [§II vs §III, role of f(r) at large T] Eqs. (5)–(6) for L and E at finite T, and Eqs. (8)–(12) for transport, all use f(r) from Eq. (3) with w(r) trained at T=0 (f≡1). The implicit assumption that w(r) is T- and μ-independent is standard in AZ-type bottom-up models but is also the assumption that does the bulk of the work in extending to finite T,μ. This should be stated explicitly as an assumption in the abstract/introduction and its limitations addressed; otherwise readers may interpret the agreement with HotQCD 2πTD and JET q̂ as a successful prediction rather than a built-in feature of any AZ-type background with reasonable IR behavior.
- [§III, comparison to JETSCAPE and RHIC/LHC] Figs. 9–10 overlay the model with JETSCAPE and the JET/Bayesian extractions of q̂ but no error or sensitivity bands are shown for the model curve. Given that g, w(r), and the boundary conditions all carry training uncertainty, a band reflecting at minimum the variation across reasonable retrainings (or across loss-weight choices) should be displayed. As is, the qualitative agreement claim is hard to assess.
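The radial window requested in the second major comment can be located numerically. The sketch below assumes the blackening factor f(r) in the form quoted elsewhere on this page and solves f(r_s) = v² by bisection; the function names, the tolerance, and the sample values of r_h, μ, and v are illustrative assumptions, not values from the paper.

```python
def f(r: float, r_h: float, mu: float) -> float:
    """AdS-RN blackening factor in the form quoted on this page (L_AdS = 1)."""
    return 1.0 - (1.0 / r_h**4 + mu**2 / r_h**2) * r**4 + (mu**2 / r_h**4) * r**6


def find_r_s(v: float, r_h: float, mu: float, tol: float = 1e-12) -> float:
    """Solve f(r_s) = v^2 on (0, r_h) by bisection.

    Assumes f falls monotonically from 1 at r = 0 to 0 at r = r_h,
    which holds when mu^2 * r_h^2 < 2 (positive temperature).
    """
    lo, hi = 0.0, r_h
    target = v * v
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, r_h, mu) > target:
            lo = mid  # f is still above v^2, so the root lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    r_h, mu = 1.2, 0.5  # placeholder values, not taken from the paper
    for v in (0.3, 0.6, 0.9):
        print(f"v={v:.1f}  probed window: ({find_r_s(v, r_h, mu):.4f}, {r_h:.4f})")
```

At μ = 0 the root reduces to r_s = r_h (1 − v²)^(1/4), which gives a quick check of the solver. Overlaying the printed windows on the data-constrained range of w(r) is then a plotting exercise.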
minor comments (10)
- [Abstract / §I] The phrase 'the determination of a constant g related to the string tension' should specify that g is determined by a fit to a Cornell parametrization of lattice data, not derived independently.
- [Eq. (4)] The Hawking temperature formula T=(1/πr_h)(1−μ²r_h²/2) is given without derivation; please indicate the change of variables from q to μ used in going from Eq. (2) to Eq. (3) and confirm dimensional conventions (r is stated dimensionless with L_AdS=1 — clarify how μ acquires GeV units in Figs. 3,5).
- [Eq. (6)] The counterterm structure −2g w(0)/r0 + 2g w'(0) ln(r0) is asserted; a short justification or reference for this specific subtraction would help the reader, especially since w'(0) is not directly visible in the analytic form given.
- [§II, w(r) functional form] The closed-form sin-of-sin expression should be accompanied by a plot of its first and second derivatives, and by an explicit check that w'(0)≥0 (used implicitly in the regularization).
- [Fig. 1] The KAN architecture sketch is too schematic to be reproducible from the paper alone; please give the layer widths, spline orders, and pruning thresholds in the caption or an appendix.
- [Fig. 7] Axes should carry units (dE/dx in GeV/fm? p in GeV?), and the choice T=1.1 Tc vs 2 Tc should be motivated.
- [Fig. 8] The lattice points labeled Nf=0 and Nf=2+1 should cite the specific references in the caption and indicate the systematic-uncertainty status; currently only Ref. [55] is cited in text.
- [§IV] The runtime comparison (851 s vs 15991 s) is informative but the loss values (0.0328 vs 0.0847) are not directly comparable unless the loss definitions and weights are identical; please state this explicitly.
- [Typography] Several figure axis labels render as control characters in the source (e.g. Figs. 3, 4, 6, 9, 10); please regenerate the figures with proper Unicode/LaTeX axis labels.
- [References] Ref. [109] is cited as 'our recent work' on emergent metrics with Neural ODEs; the relationship of the present KAN approach to that earlier MLP-based work, and to Ref. [108] on KAN vs MLP for the inverse potential problem, should be stated more sharply in the introduction to delineate novelty.
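The minor comment on Eq. (4) can be checked numerically. This sketch assumes the standard horizon-regularity relation T = |f'(r_h)|/(4π) and compares it with the quoted formula T = (1/πr_h)(1 − μ²r_h²/2); the function names, the step size, and the sample (r_h, μ) pairs are illustrative assumptions.

```python
import math


def f(r: float, r_h: float, mu: float) -> float:
    """AdS-RN blackening factor in the form quoted on this page."""
    return 1.0 - (1.0 / r_h**4 + mu**2 / r_h**2) * r**4 + (mu**2 / r_h**4) * r**6


def hawking_T(r_h: float, mu: float) -> float:
    """Temperature formula quoted in the minor comment on Eq. (4)."""
    return (1.0 / (math.pi * r_h)) * (1.0 - 0.5 * mu**2 * r_h**2)


def T_from_slope(r_h: float, mu: float, h: float = 1e-6) -> float:
    """T = |f'(r_h)| / (4 pi), with f' estimated by a central difference."""
    slope = (f(r_h + h, r_h, mu) - f(r_h - h, r_h, mu)) / (2.0 * h)
    return abs(slope) / (4.0 * math.pi)


if __name__ == "__main__":
    # Placeholder parameter pairs in units where L_AdS = 1.
    for r_h, mu in ((1.0, 0.0), (1.2, 0.5), (0.8, 1.0)):
        print(r_h, mu, hawking_T(r_h, mu), T_from_slope(r_h, mu))
```

The two columns should agree to within the finite-difference error, which confirms that the quoted T follows from the quoted f(r). It does not address the separate question of how μ acquires GeV units.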
Simulated Author's Rebuttal
We thank the referee for a careful and constructive report. The five major comments converge on a coherent and legitimate concern: that our inverse problem is underdetermined, that the finite-T,μ extension rests on an unmodified T=0 w(r), and that the quoted "qualitative agreement" with lattice/JET data is presented without uncertainty quantification. We accept these criticisms. The revised manuscript will (1) add an ensemble study quantifying the spread in g, w(r), and downstream observables under retraining and loss-weight variations; (2) annotate every transport-coefficient figure with the radial window (r_s, r_h) actually probed, overlaid on the data-constrained range, so the reader can see where the result is interpolation versus IR extrapolation; (3) state explicitly (in the abstract, introduction, and §III) that w(r) is fixed at T=μ=0 and that finite-T,μ enters only through the AdS-RN factor f(r), with limitations spelled out; (4) attach uncertainty bands to Figs. 9–10 reflecting training systematics; and (5) discuss explicitly the g↔w(r) degeneracy that explains the shift from the standard AZ value g=0.176 to our g=0.2573. We believe these revisions sharpen the paper's scope as a methodological proof-of-concept for end-to-end ML-assisted holography, in line with the referee's framing, without overclaiming a quantitative determination of QGP transport.
Point-by-point responses
-
Referee: Uniqueness of (w,g): KAN training yields g=0.2573 vs. AZ's 0.176 from essentially the same Cornell input; report loss landscape, retraining spread, and explain the ~46% shift.
Authors: We agree this is the central concern and will address it in the revised version. (i) We will add an appendix presenting an ensemble of retrainings (varying KAN seed, grid size, and the relative weight of the boundary/monotonicity penalties) and report the resulting spread in g, in the parameters of w(r), and in the downstream transport observables (drag, 2πTD, q̂/T³). Preliminary checks indicate that g lies in a narrow band (≈0.24–0.28) once the UV constraint w(0)→1 and monotonicity are enforced, but we will document this quantitatively rather than asserting it. (ii) Concerning the shift from the AZ value g=0.176: the AZ fit is performed with the fixed Gaussian ansatz w=exp(sr²) and g and s are determined jointly. Our KAN allows a more flexible w(r) and the Cornell coefficients we target (0.8404, 0.0866) are not identical to those used in the original AZ fits to the meson spectrum. Because the linear (long-distance) part of E scales jointly with g and the IR slope of w(r), a different functional family redistributes the string tension between g and w′(r) at large r, which is precisely the underdetermination the referee points out. We will add an explicit discussion of this g↔w(r) degeneracy and quote the product g·w(r→r_max) as the more nearly invariant combination. revision: yes
-
Referee: Radial domain of validity: at finite T transport observables probe r_s…r_h, which lies beyond the range constrained by T=0 Cornell data; the sin-of-sin ansatz extrapolates into the IR set by the basis, not the data.
Authors: This is a fair criticism and we will address it directly. For each panel of Figs. 6–10 we will (a) annotate the (r_s, r_h) interval actually probed at the relevant T (and v), (b) overlay this interval on the radial range mapped from L∈[0,1.2] fm via Eq. (5) at T=0, which is the genuinely data-constrained region, and (c) shade the IR portion that constitutes extrapolation. We will also state explicitly in §III that, beyond the data-constrained window, the transport results inherit the IR behavior of the trained basis and should be read as a self-consistent prediction of the AZ-type framework with our particular w(r), not as a data-driven determination. We agree with the referee that qualitative agreement with RHIC/LHC q̂ is shared by a broad class of monotonic backgrounds; this will be acknowledged in the discussion. revision: yes
-
Referee: Tc(μ) endpoint at μ≈0.93 GeV is derived from w(r) trained only at μ=0; clarify what fixes the μ-dependence and whether the endpoint depends on KAN seed/regularization.
Authors: The referee is correct that the μ-dependence in our framework is not independently constrained: it enters solely through the AdS-RN blackening factor f(r) of Eq. (3) (with the standard q↔μ relation), while w(r) is held fixed at its T=μ=0 form. We will state this explicitly in §II and in the abstract/introduction. We will also add a sensitivity study showing how the endpoint of the Tc(μ) curve depends on (i) the KAN training seed and regularization weights, and (ii) the precise definition used to identify the deconfinement temperature from the Polyakov-loop slope. We expect the endpoint to shift by an O(10–20%) amount under these variations and will report the band rather than a single value, and we will downgrade the claim from "qualitative agreement with NJL/PQM" to "qualitatively similar shape, with the endpoint location subject to the stated systematics." revision: yes
-
Referee: The assumption that w(r) is T- and μ-independent does the heavy lifting; it should be stated explicitly as an assumption with limitations.
Authors: We accept this point. The revised abstract and introduction will state explicitly that w(r) is determined from T=μ=0 data and assumed independent of T and μ, with finite-T,μ effects entering only through f(r) — the standard AZ/bottom-up assumption. A short paragraph in §III will discuss the limitations: in particular, that screening and medium-modification of the long-range string tension cannot be captured by an unmodified w(r), and that consequently the qualitative agreement with HotQCD 2πTD and JET q̂ should not be interpreted as an independent ML-driven prediction, but as the behavior of an AZ-type background with a data-fitted UV/intermediate region and a smooth, monotonic IR completion. We thank the referee for forcing us to be precise on this. revision: yes
-
Referee: Figs. 9–10: no error/sensitivity bands on model curves; provide bands reflecting training-uncertainty spread.
Authors: We will add uncertainty bands to Figs. 9 and 10 (and, where relevant, to Figs. 6–8). The bands will be constructed from the ensemble of retrainings described in our response to comment 1 — varying KAN seed, network width/grid, and the loss-weight combinations for the UV boundary and monotonicity penalties — and will represent the envelope of resulting predictions. We will additionally include, in Fig. 10, a band reflecting the propagation of the dominant uncertainty in g. We agree that without these bands the "qualitative agreement" claim is hard to assess quantitatively, and we will phrase the comparison accordingly. revision: yes
- We cannot promise that the spread in transport observables across retrainings will be small; if the ensemble study reveals large variability, the conclusions of §III will be weakened accordingly and we will report this honestly rather than tune it away.
- We cannot, within the present T-/μ-independent w(r) framework, provide an independent physical input that fixes the μ-dependence; the endpoint of Tc(μ) genuinely inherits the AdS-RN ansatz and we will not claim otherwise.
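The envelope bands promised in the responses amount to a pointwise minimum and maximum over an ensemble of curves. The sketch below shows only that bookkeeping. The ensemble here is synthetic: it jitters the six quoted coefficients of w(r) by an arbitrary 1% and is a stand-in for real retrainings, which are not available on this page. All names and the jitter size are assumptions.

```python
import math
import random

# Coefficients of the closed-form w(r) quoted on this page.
CENTRAL = (2.94, 1.24, 0.47, 1.94, 9.55, 3.53)


def w(r, p=CENTRAL):
    """w(r) = a sin(b sin(c r + d) - e) + k for a coefficient tuple p."""
    a, b, c, d, e, k = p
    return a * math.sin(b * math.sin(c * r + d) - e) + k


def synthetic_ensemble(n, rel_jitter=0.01, seed=0):
    """SYNTHETIC stand-in for retrainings: random jitter, not real results."""
    rng = random.Random(seed)
    members = [CENTRAL]  # keep the quoted fit as one ensemble member
    for _ in range(n - 1):
        members.append(tuple(x * (1.0 + rng.gauss(0.0, rel_jitter)) for x in CENTRAL))
    return members


def envelope(members, grid):
    """Pointwise (min, max) band of w(r) across all ensemble members."""
    lower, upper = [], []
    for r in grid:
        values = [w(r, p) for p in members]
        lower.append(min(values))
        upper.append(max(values))
    return lower, upper


if __name__ == "__main__":
    grid = [0.1 * i for i in range(21)]
    lower, upper = envelope(synthetic_ensemble(20), grid)
    for r, lo, hi in zip(grid, lower, upper):
        print(f"r={r:.1f}  band=[{lo:.4f}, {hi:.4f}]")
```

With real retrainings, each member would be a separately fitted (w, g) pair, and the same `envelope` step would be applied to the derived transport curves rather than to w(r) itself.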
Circularity Check
Mild circularity: the T=0 "prediction" reproduces the Cornell fit it was trained on; finite-T transport claims are extrapolations of an underdetermined fit but not strictly circular.
specific steps
-
fitted input called prediction
[Section II, Fig. 2(b)]
"the mean absolute error... quantifies the discrepancy between model predictions of the heavy-quark potential calculated by Eq. (6) and the target values from lattice QCD data... using the Cornell potential E = 0.8404L − 0.0866/L + 0.1033 as the target value... Fig. 2 (b) illustrates the fitting performance... These curves clearly demonstrate the excellent ability of the network in modeling the function E(L)... verifies the reliability of the KANs."
The loss is the MAE between Eq. (6) and the Cornell fit to lattice data [130]. Fig. 2(b) then compares the trained model to the same lattice data and presents agreement as validation. This is a fit reproducing its training target; it shows the KAN can represent E(L), not that the holographic model has predictive content. Framing conflates fit quality with model validity.
-
ansatz smuggled in via citation
[Section I, baseline w(r) ansatz]
"In the Andreev-Zakharov model, the parameter g (related to the string tension) is fixed as g = 0.176, while the deformation factor w(r) takes the form w(r) = e^{sr^2} with s = 0.45. These values are determined from fits to the meson spectrum [127, 128] and the Cornell potential [21, 129]."
The deformed AdS5-RN class with a multiplicative w(r) and its calibration philosophy are imported from Andreev-Zakharov. The KAN replaces e^{sr^2} but recalibrates to the same Cornell target, so the 'analytical solution for w(r)' inherits identifiability problems from the prior framework rather than resolving them. The very different g (0.2573 vs 0.176) recovered from the same data class shows non-uniqueness.
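The non-uniqueness can be seen by putting the two calibrations side by side. This sketch uses only numbers quoted on this page (g = 0.176 and w = exp(0.45 r²) for Andreev-Zakharov; g = 0.2573 and the sin-of-sin form for the KAN fit). The function names and the radial grid are illustrative, and tabulating the product g·w(r) is a suggestion taken from the rebuttal, not a result from the paper.

```python
import math

G_AZ, S_AZ = 0.176, 0.45  # Andreev-Zakharov values quoted on this page
G_KAN = 0.2573            # KAN-fitted value quoted on this page


def w_az(r: float) -> float:
    """Andreev-Zakharov deformation factor w(r) = exp(s r^2)."""
    return math.exp(S_AZ * r * r)


def w_kan(r: float) -> float:
    """KAN closed-form deformation factor quoted on this page."""
    return 2.94 * math.sin(1.24 * math.sin(0.47 * r + 1.94) - 9.55) + 3.53


def relative_shift(new: float, old: float) -> float:
    """Fractional change of `new` relative to `old`."""
    return (new - old) / old


if __name__ == "__main__":
    print(f"relative shift in g: {relative_shift(G_KAN, G_AZ):.1%}")
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):  # arbitrary sample radii
        print(f"r={r:.1f}  g*w (AZ)={G_AZ * w_az(r):.4f}  g*w (KAN)={G_KAN * w_kan(r):.4f}")
```

The relative shift evaluates to about 46%, matching the figure in the referee report. How closely the two g·w(r) columns track each other over the data-constrained range is what a uniqueness study would need to quantify.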
full rationale
The pipeline is: (i) train a KAN form of w(r) and string-tension constant g against a Cornell-form target E = 0.8404 L − 0.0866/L + 0.1033 (itself a fit to the 2-flavor lattice data of Kaczmarek–Zantow [130]); (ii) plug w(r) into Andreev-Zakharov formulas to compute finite-T,μ heavy-quark potential, drag force, diffusion coefficient, and jet quenching; (iii) compare to lattice/experiment. One mildly circular element: Fig. 2(b) and the surrounding text frame the KAN as making "predictions" of the heavy-quark potential and show agreement with the same lattice data the loss was minimized against. This is a fit being validated against itself; it tests the representational capacity of the KAN, not the predictive power of the holographic ansatz. The authors are transparent that it is a reconstruction, but the phrasing "verifies the reliability of the KANs" conflates fit quality with predictivity. Beyond this, the skeptic's worries are real correctness/identifiability issues, but they concern extrapolation, not circularity by definition. These worries are (a) non-unique w(r) at T=0: Andreev-Zakharov's e^{sr^2} with g=0.176 fits the same data, while this paper finds g=0.2573 with a sin-of-sin form; and (b) finite-T transport evaluates w at r_s and r_h, which drift into IR regions the T=0 training never constrains. Self-citations to Refs. [108, 109, 119] are present but not load-bearing for uniqueness claims; they motivate methodology. The transport formulas (Eqs. 8–12) are from independent prior literature (Gubser; Herzog et al.; Liu–Rajagopal–Wiedemann; Andreev), not author self-citations. So overall circularity stays low.
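The distinction between fit quality and predictive content can be made concrete with the training target itself. The sketch below codes the quoted Cornell form and a mean-absolute-error helper. The grid over L ∈ [0.1, 1.2] and the deliberately offset stand-in model are assumptions for illustration only.

```python
def cornell(L: float) -> float:
    """Cornell-form target E(L) = 0.8404 L - 0.0866 / L + 0.1033 quoted on this page."""
    return 0.8404 * L - 0.0866 / L + 0.1033


def mae(model, target, grid) -> float:
    """Mean absolute error between two callables evaluated on a grid."""
    return sum(abs(model(x) - target(x)) for x in grid) / len(grid)


def offset_model(L: float) -> float:
    """Hypothetical stand-in 'model': the target plus a constant 0.01 offset."""
    return cornell(L) + 0.01


if __name__ == "__main__":
    # Illustrative grid; L = 0 is excluded because of the 1/L term.
    grid = [0.1 + 0.05 * i for i in range(23)]
    print("MAE of target against itself:", mae(cornell, cornell, grid))
    print("MAE of offset stand-in:", mae(offset_model, cornell, grid))
```

A small MAE on this grid says only that the model reproduces the curve it was trained on. A test of predictive content needs a quantity, or a range of L, that never entered the loss.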
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith.Cost (Jcost) · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
w(r) = 2.94 sin(1.24 sin(0.47 r + 1.94) − 9.55) + 3.53, with g=0.2573, fit to Cornell potential E = 0.8404 L − 0.0866/L + 0.1033
-
IndisputableMonolith.Unification.YangMillsMassGap · massGap = (√5−2)/2 (spectral_gap) · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
drag force, jet quenching parameter q̂, and diffusion coefficient computed via standard holographic formulas with the KAN-extracted w(r)
-
IndisputableMonolith.Unification.SpacetimeEmergence · lorentzian_signature · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
f(r) = 1 − (1/r_h^4 + μ²/r_h²) r^4 + (μ²/r_h^4) r^6; AdS/RN background with deformation factor
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.