arxiv: 2604.28154 · v1 · submitted 2026-04-30 · ✦ hep-ph

Mapping data sensitivities in global QCD analysis with linear response and influence functions

Richard Whitehill

Pith reviewed 2026-05-07 07:05 UTC · model grok-4.3

classification ✦ hep-ph

keywords global QCD analysis · linear response · influence functions · data sensitivity · hadron structure · inverse problems · gradient methods · quantum correlation functions

The pith

Linear response and influence functions quantify how each data point shapes the central values, uncertainties, and correlations of QCD fits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Global QCD analyses extract hadron structure from experimental data, yet the high-dimensional fits make it hard to trace how specific measurements drive the results. The paper develops a framework of linear response and influence functions, which are gradient-based sensitivity measures that track the flow of experimental information into the fitted quantities. These measures show directly how data sets the central values and uncertainties of quantum correlation functions along with the correlations among them. A sympathetic reader would care because the approach gives a transparent diagnostic for information flow in these inverse problems, making the origin of fit results easier to inspect and interpret.

Core claim

Here we develop a framework based on linear response and influence functions, which are gradient-based sensitivity measures that directly quantify how experimental information propagates to fitted quantities and observables. These quantities cleanly expose how data locally determines the central values and uncertainties of quantum correlation functions, as well as the correlations between them, providing a transparent and general framework for diagnosing information flow in inverse problems in QCD.

What carries the argument

Linear response and influence functions, gradient-based sensitivity measures that quantify how experimental information propagates to fitted quantities and observables.
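
For orientation, a generic form of such a gradient-based sensitivity can be written down with the implicit function theorem. This is a sketch under assumed notation, not the paper's own definition: $\theta$ denotes the fit parameters, $y$ the data vector, $T(\theta)$ the theory predictions, $\Sigma$ the data covariance, and $\mathcal{O}$ any derived observable. The paper's baseline analysis uses Hamiltonian Monte Carlo, so its actual definitions may be posterior-based and differ in detail.

```latex
% Assumed notation, introduced here for illustration only.
\begin{align}
\chi^2(\theta; y) &= \big(y - T(\theta)\big)^{\top} \Sigma^{-1} \big(y - T(\theta)\big), \\
0 &= \left.\frac{\partial \chi^2}{\partial \theta}\right|_{\theta = \hat\theta(y)}
\quad\Longrightarrow\quad
\frac{d\hat\theta}{dy} = -H^{-1}\,\frac{\partial^2 \chi^2}{\partial\theta\,\partial y},
\qquad
H \equiv \frac{\partial^2 \chi^2}{\partial\theta\,\partial\theta^{\top}}, \\
\frac{d\mathcal{O}}{dy} &= \left(\frac{\partial \mathcal{O}}{\partial \theta}\right)^{\top} \frac{d\hat\theta}{dy}.
\end{align}
```

All derivatives are evaluated at the best-fit point $\hat\theta(y)$. The second line follows from differentiating the stationarity condition with respect to $y$; the third is the chain rule applied to an observable that depends on the data only through the fitted parameters.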

If this is right

The method directly quantifies the contribution of each experiment to the central values of the fitted quantum correlation functions.
It maps how data determines both the uncertainties and the mutual correlations among those functions.
The framework supplies a general diagnostic for tracing information flow through any high-dimensional inverse problem in QCD.
Local gradient calculations replace the need for repeated full re-optimizations when testing individual data influences.
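
The last point can be illustrated with a toy fit. The following is a minimal sketch, assuming a linear theory model and uncorrelated uncertainties; the matrix `A`, the helper `fit`, and all numbers are hypothetical and are not taken from the paper. In this linear case the response matrix reproduces the refit exactly, which is the limit in which the gradient shortcut is guaranteed to work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 30 data points, 3 parameters, linear theory T(theta) = A @ theta.
n_data, n_par = 30, 3
A = rng.normal(size=(n_data, n_par))
sigma = 0.1 * np.ones(n_data)            # uncorrelated uncertainties (assumption)
W = np.diag(1.0 / sigma**2)              # inverse data covariance
theta_true = np.array([1.0, -0.5, 2.0])
y = A @ theta_true + sigma * rng.normal(size=n_data)


def fit(y_obs):
    """Minimise chi2 = (y - A theta)^T W (y - A theta) in closed form."""
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ y_obs)


theta_hat = fit(y)

# Linear response matrix d(theta_hat)/dy, shape (n_par, n_data).
response = np.linalg.solve(A.T @ W @ A, A.T @ W)

# Shift one data point by one standard deviation and compare the two routes.
i = 7
dy = np.zeros(n_data)
dy[i] = sigma[i]
predicted_shift = response @ dy          # gradient-based prediction, no refit
refit_shift = fit(y + dy) - theta_hat    # explicit refit on the shifted data

print("predicted:", predicted_shift)
print("refit:    ", refit_shift)
```

Each column of `response` tells how the best-fit parameters move per unit shift of one data point, so a single matrix replaces one refit per data point.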

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same gradient machinery could be used to rank the impact of proposed future experiments on particular observables before data are taken.
The linear-response view might be combined with existing uncertainty quantification tools to produce more localized error bands.
Similar sensitivity maps could be applied to other inverse problems in particle physics that rely on large global fits.

Load-bearing premise

Linear approximations around the best-fit point are assumed to accurately capture data sensitivities even in the high-dimensional and potentially nonlinear space of QCD fits.

What would settle it

A direct comparison of the linear predictions against the actual changes obtained by fully refitting the global analysis after removing or perturbing a single data set; large discrepancies in central values or uncertainties would show the approximation fails.
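
The shape of such a test can be sketched on a toy nonlinear fit. The example below is an illustration under stated assumptions, not the paper's analysis: the exponential model, the Gauss-Newton `refit` helper, and the perturbation sizes are all hypothetical. It compares the first-order prediction with an explicit refit for a small and a large shift of one data point.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonlinear "theory": T(x; a, b) = a * exp(-b * x).
x = np.linspace(0.0, 2.0, 40)
sigma = 0.02
w = np.full(x.size, 1.0 / sigma**2)      # diagonal inverse covariance


def theory(theta):
    a, b = theta
    return a * np.exp(-b * x)


def jacobian(theta):
    a, b = theta
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])   # shape (n_data, 2)


def refit(y, theta0, n_iter=50):
    """Gauss-Newton minimisation of chi2 = sum w * (y - T)^2."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        J = jacobian(theta)
        r = y - theory(theta)
        theta = theta + np.linalg.solve(J.T @ (w[:, None] * J), J.T @ (w * r))
    return theta


def response_matrix(y, theta):
    """d(theta_hat)/dy from the implicit function theorem, using the full Hessian."""
    a, b = theta
    e = np.exp(-b * x)
    J = jacobian(theta)
    wr = w * (y - theory(theta))
    # Second derivatives of T with respect to (a, b), contracted with w * residual.
    off_diag = np.sum(wr * (-x * e))
    curv = np.array([[0.0, off_diag], [off_diag, np.sum(wr * (a * x**2 * e))]])
    half_hessian = J.T @ (w[:, None] * J) - curv
    return np.linalg.solve(half_hessian, J.T * w)   # shape (2, n_data)


theta_true = np.array([2.0, 1.5])
y = theory(theta_true) + sigma * rng.normal(size=x.size)
theta_hat = refit(y, theta_true)
R = response_matrix(y, theta_hat)


def relative_error(n_sigma, index=5):
    """Mismatch between linear prediction and refit when one point moves by n_sigma."""
    dy = np.zeros(x.size)
    dy[index] = n_sigma * sigma
    predicted = R @ dy
    actual = refit(y + dy, theta_hat) - theta_hat
    return np.linalg.norm(predicted - actual) / np.linalg.norm(actual)


err_small = relative_error(0.1)
err_large = relative_error(5.0)
print(f"relative error, 0.1 sigma shift: {err_small:.2e}")
print(f"relative error, 5 sigma shift:   {err_large:.2e}")
```

The mismatch grows with the size of the perturbation, which is the behaviour the proposed check would quantify for a realistic global fit.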

Figures

Figures reproduced from arXiv: 2604.28154 by Richard Whitehill.

**Figure 1.** Hamiltonian Monte Carlo (HMC) results for the baseline global analysis.

**Figure 2.** Response and influence function results for PDF parameters.

**Figure 3.** Response of PDFs to individual data points,

**Figure 5.** Influence of data points on the up–down PDF correlation

Original abstract

Global QCD analyses provide the primary framework for extracting hadron structure from experimental data, yet the mechanisms by which data constrain non-perturbative functions remain difficult to interpret due to the high dimensionality and complexity of these fits. Here we develop a framework based on linear response and influence functions, which are gradient-based sensitivity measures that directly quantify how experimental information propagates to fitted quantities and observables. These quantities cleanly expose how data locally determines the central values and uncertainties of quantum correlation functions, as well as the correlations between them, providing a transparent and general framework for diagnosing information flow in inverse problems in QCD.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper adapts influence functions to trace data impact in QCD fits, but the linear approximation needs concrete checks against actual refits to hold up.

The paper introduces a framework using linear response and influence functions to map how data constrains parton distributions in global QCD fits. This is the main new element: adapting these sensitivity tools to the inverse problem of extracting non-perturbative functions from collider data.

It does well in setting up the general formalism. The definitions for how data perturbations affect central values, uncertainties, and correlations between observables seem logically derived from the fit procedure. This could indeed make the black-box nature of these analyses a bit clearer for practitioners.

The soft spots are around the practical accuracy. The method assumes the linear approximation holds for the relevant data variations, but QCD fits involve nonlinear evolution and high-dimensional parameter spaces. Without examples comparing the influence predictions to actual refits after data removal or scaling, it's difficult to gauge how reliable the sensitivities are. The stress-test note on nonlinearity is a real issue here.

The citation pattern looks fine for a methods paper, focusing on the core idea without overclaiming prior work.

This work is for researchers who perform or rely on global QCD analyses and want better ways to interpret data impact. A reader with background in both PDF fitting and statistical influence methods would find it useful. It deserves a serious referee because the core idea has potential to improve analysis practices, even if it requires more demonstration to stand on its own. I would recommend sending it to peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a framework based on linear response theory and influence functions to quantify the propagation of experimental data information into the central values, uncertainties, and correlations of parton distribution functions (PDFs) and derived observables in global QCD analyses. The approach defines gradient-based sensitivity measures around the best-fit point to diagnose local data-to-PDF mappings without repeated refits.

Significance. If the linear approximations prove accurate for relevant data variations, the framework would offer a computationally efficient and interpretable tool for mapping information flow in high-dimensional QCD fits. This could aid in identifying which datasets constrain specific PDF features, improving uncertainty quantification, and guiding experimental design, constituting a useful methodological contribution to global analyses.

major comments (2)

[§3 (framework derivation) and §5 (numerical results)] The central claim that linear response and influence functions 'cleanly expose' data sensitivities assumes the first-order Taylor expansion around the minimum remains accurate. However, global QCD fits involve nonlinear DGLAP evolution, convolution integrals, and flexible parameterizations (typically 20-50 parameters) where the Hessian can be ill-conditioned. No explicit validation—such as comparing influence-function predictions to actual refits after finite data removal or rescaling—is presented to confirm the approximation's validity for typical perturbations.
[§4.1 (definition of influence functions)] The influence functions are defined via gradients of the existing fit procedure, but the manuscript does not address how regularization choices in the Hessian or tolerance criteria affect the resulting sensitivity measures, which could introduce systematic biases in the reported correlations.

minor comments (2)

[§2 (background)] Notation for the linear response operator and the influence function could be clarified with an explicit equation relating them to the Hessian and gradient of the chi-squared function.
[Figures 2-4] Figure captions should explicitly state the specific global fit (e.g., NNPDF or CT18) and data sets used in the demonstrations to allow reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the significance of our work and for the constructive comments, which help strengthen the presentation of the linear response framework. We address each major comment below and indicate the revisions we will implement.

Referee: [§3 (framework derivation) and §5 (numerical results)] The central claim that linear response and influence functions 'cleanly expose' data sensitivities assumes the first-order Taylor expansion around the minimum remains accurate. However, global QCD fits involve nonlinear DGLAP evolution, convolution integrals, and flexible parameterizations (typically 20-50 parameters) where the Hessian can be ill-conditioned. No explicit validation—such as comparing influence-function predictions to actual refits after finite data removal or rescaling—is presented to confirm the approximation's validity for typical perturbations.

Authors: We agree that explicit validation of the linear approximation is important for establishing the practical utility of the framework. The influence functions are formally exact to first order at the best-fit minimum, but we acknowledge that the original manuscript did not include direct numerical comparisons against refits. In the revised manuscript we will add a dedicated validation subsection in §5. This will include a limited set of explicit refits after small, controlled data perturbations (e.g., rescaling selected datasets by 5–10% or removing a small number of points) and direct comparison of the resulting PDF shifts and uncertainty changes against the predictions obtained from the influence functions. We will also add a short discussion of the expected range of validity, noting that larger perturbations will eventually probe nonlinearities while the method remains intended for local sensitivity diagnostics around the minimum. This addition will directly address the concern about the ill-conditioned Hessian and nonlinear evolution.
revision: yes

Referee: [§4.1 (definition of influence functions)] The influence functions are defined via gradients of the existing fit procedure, but the manuscript does not address how regularization choices in the Hessian or tolerance criteria affect the resulting sensitivity measures, which could introduce systematic biases in the reported correlations.

Authors: The influence functions are constructed from the gradient of the total χ² with respect to the data, evaluated using the inverse Hessian matrix obtained from the original global fit. Consequently, they inherit the same regularization scheme and tolerance criterion that were used to determine the best-fit point and its uncertainties. We will revise the text in §4.1 to make this dependence explicit, stating that the reported data-to-PDF sensitivities and correlations are those of the regularized fit. We will also add a brief paragraph discussing how changes in the tolerance criterion would rescale the overall sensitivity measures uniformly (while preserving relative rankings of datasets), and we will note that any systematic bias from regularization is already present in the baseline fit itself. If space allows, we can include a short numerical illustration of the effect of varying the tolerance parameter on a subset of the sensitivity maps.
revision: partial

Circularity Check

0 steps flagged

No circularity in gradient-based sensitivity framework

The paper defines influence functions and linear response measures directly from gradients of the existing global QCD fit procedure (χ² minimization with DGLAP evolution and PDF parameterizations). These quantities are constructed to compute data-to-observable sensitivities by design, without any reduction of a claimed prediction back to fitted inputs by construction, self-definition of central results, or load-bearing self-citations. The derivation chain is a standard application of first-order Taylor expansion and statistical influence functions to an inverse problem; it does not rename known results or smuggle ansatze via prior work. The linearity assumption is an explicit modeling choice whose validity is separate from circularity. No load-bearing step equates outputs to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on standard assumptions from optimization and sensitivity analysis rather than introducing new free parameters or entities.

axioms (1)

Domain assumption: The objective function of the global QCD fit is differentiable with respect to both the fit parameters and the input data points.
Why it is needed: Required to define gradients and influence functions as gradient-based sensitivity measures.
