Deep Ranking with Heterogeneous Effects

Ruijian Han; Shuxing Fang; Yiming Xu; Yuanhang Luo

arXiv: 2604.16129 · v2 · submitted 2026-04-17 · 📊 stat.ME

Pith reviewed 2026-05-10 08:04 UTC · model grok-4.3

keywords: ranking models, semiparametric estimation, neural network approximation, identifiability, minimax optimality, latent scores, covariate effects

The pith

Semiparametric ranking separates intrinsic utilities from nonlinear covariate effects with optimal rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Classical latent-score models cannot separate an object's fixed ranking strength from the influence of contextual covariates that often act nonlinearly. This paper introduces a framework in which the log-score of each object equals a parametric utility plus a nonparametric function of observed covariates. It proves the decomposition is identifiable under mild regularity and connectivity conditions on the data. Estimation approximates the nonparametric part with a neural network and maximizes the likelihood, delivering non-asymptotic error bounds that match the minimax rate for both the parametric and nonparametric pieces. Readers care because the approach lets ranking systems credit intrinsic quality while correctly adjusting for varying conditions such as player form or match settings.

Core claim

The central claim is that a semiparametric ranking model with log-score equal to a utility parameter plus nonparametric covariate effects is identifiable under mild regularity and connectivity conditions. Maximum-likelihood estimation that approximates the nonparametric component by a neural network produces an estimator that exists with high probability under random design and attains minimax-optimal non-asymptotic error bounds simultaneously for the parametric utilities and the nonparametric functions.
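For concreteness, the model can be written out for paired comparisons. This is a sketch in assumed notation, taking a Bradley-Terry-type link as the working example; the paper's own symbols, and any more general ranking formats it covers, are not reproduced here.

```latex
% Assumed notation: u_i is the utility of object i, f the shared covariate
% effect, x_i the covariates of object i in a given comparison.
\log s_i(x_i) = u_i + f(x_i),
\qquad
\mathbb{P}(i \text{ beats } j \mid x_i, x_j)
  = \frac{e^{u_i + f(x_i)}}{e^{u_i + f(x_i)} + e^{u_j + f(x_j)}}
  = \sigma\bigl(u_i - u_j + f(x_i) - f(x_j)\bigr),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}.
```

In this form, adding a constant to every u_i, or to f, leaves all comparison probabilities unchanged, so some normalization (for example, utilities summing to zero) is needed before identifiability can even be stated. Separating u from f beyond that is what the regularity and connectivity conditions are for.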

What carries the argument

A semiparametric log-score model that adds a nonparametric covariate effect to a parametric utility. The covariate effect is approximated by a neural network, and the full model is estimated by maximum likelihood.

If this is right

Intrinsic utilities remain identifiable even when covariate effects dominate observed outcomes.
The estimator exists with high probability and achieves the best possible rates for both parametric and nonparametric parts.
The framework applies directly to paired comparison data such as tennis match results.
Nonparametric effects can be learned without sacrificing rate optimality for the utilities.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The separation of intrinsic score from context could improve ranking accuracy in recommendation or sports systems where conditions change across observations.
Replacing the neural network with other nonparametric estimators might preserve the same optimality results under similar assumptions.
The connectivity condition suggests that dense comparison graphs are needed in practice for reliable recovery of both components.

Load-bearing premise

The data must follow random design and the comparison graph must be sufficiently connected so that utilities and covariate functions can be uniquely recovered.

What would settle it

Generate synthetic ranking data from the model with known utility values and known covariate functions, then check whether the estimator recovers both components inside the stated error bounds; consistent failure on connected designs would falsify the optimality claim.
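The data-generating half of this check can be sketched in a few lines. The block below is a minimal illustration, assuming a Bradley-Terry-type link, two uniform covariates per object per comparison, and a made-up covariate effect `true_f`; all function names are hypothetical, and the plain root-mean-square errors are stand-ins for the norms used in the paper.

```python
import numpy as np

def true_f(x):
    """Hypothetical nonlinear covariate effect, chosen only for illustration."""
    return np.sin(2.0 * np.pi * x[:, 0]) + x[:, 1] ** 2

def simulate_comparisons(n_objects, n_comparisons, rng):
    """Draw paired comparisons from an assumed Bradley-Terry-type model."""
    u = rng.normal(size=n_objects)
    u -= u.mean()                                   # sum-to-zero normalization
    i = rng.integers(0, n_objects, size=n_comparisons)
    # An offset in 1..n_objects-1 guarantees j != i.
    j = (i + rng.integers(1, n_objects, size=n_comparisons)) % n_objects
    x_i = rng.uniform(size=(n_comparisons, 2))      # covariates of i in this comparison
    x_j = rng.uniform(size=(n_comparisons, 2))      # covariates of j in this comparison
    logit = u[i] - u[j] + true_f(x_i) - true_f(x_j)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # 1 if i beats j
    return u, i, j, x_i, x_j, y

def recovery_errors(u_hat, f_hat, u_true, f_true, x_eval):
    """Centered RMS errors; in this sketch both parts are defined up to a shift."""
    du = (u_hat - u_hat.mean()) - (u_true - u_true.mean())
    fh, ft = f_hat(x_eval), f_true(x_eval)
    df = (fh - fh.mean()) - (ft - ft.mean())
    return np.sqrt(np.mean(du ** 2)), np.sqrt(np.mean(df ** 2))

rng = np.random.default_rng(0)
u_true, i, j, x_i, x_j, y = simulate_comparisons(10, 5000, rng)
x_eval = rng.uniform(size=(1000, 2))
# Sanity check: the truth shifted by constants has zero centered error.
err_u, err_f = recovery_errors(u_true + 3.0, lambda x: true_f(x) - 3.0,
                               u_true, true_f, x_eval)
```

Any estimator can then be plugged into `recovery_errors` and its errors tracked as the number of comparisons grows, which is the comparison against the stated bounds that the paragraph above describes.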

Figures

Figures reproduced from arXiv: 2604.16129 by Ruijian Han, Shuxing Fang, Yiming Xu, Yuanhang Luo.

Figure 2: Regimes of error dominance. The parameter space is divided into two regions

Figure 3: Trace plot of $\|\hat{u} - u^*\|_{\Lambda_Q}$ and $\|\bar{f}_{\hat{\phi}} - f^*\|_{L_2(X)}$ as $n$ grows under different graph density regimes. Results are averaged over 300 replications

Figure 4: (Left) Scatter plot of the estimated utility

Figure 5: Trace plot of $\|\hat{u} - u^*\|_{\Lambda_Q}$ and $\|\bar{f}_{\hat{\phi}} - f^*\|_{L_2(X)}$ as $n$ grows under smoothness parameter $\beta$ of $f^*$. Results are averaged over 300 replications.

6.2 ATP Tennis Data. The sports analytics platform Sofascore contains professional tennis tournament records of all levels (ATP 250, 500 and 1000, ATP Finals, and Grand Slams) between 2016 and 2025. We collect available information from this platform to construct…

Figure 6: Test performance of DHR versus baselines.

Figure 7: Player scores estimated by DHR. (a) Time-varying trajectories of selected players against the “Big Three” benchmarks. (b) Average estimated scores across temporal periods. Error bars represent one standard deviation within each period. Notably, C. Alcaraz and J. Sinner won multiple Grand Slam titles. C. Alcaraz rose to World No. 1 in 2022 and Sinner reached World No. 1 in 2024. This rapid rise is captured …

Original abstract

Classical latent-score ranking models often fail to distinguish objects' intrinsic scores from contextual effects, which are typically nonlinear and can dominate the observed outcomes. To address this, we introduce a semiparametric ranking framework in which the log-score of each object is modeled as the sum of a utility parameter and a nonparametric covariate effect. Within this framework, we establish model identifiability under mild regularity and connectivity conditions. For estimation, we approximate the covariate effects using a neural network and estimate the parameters via maximum likelihood. Under random design assumptions, we prove that the resulting estimator exists with high probability and derive non-asymptotic error bounds that achieve minimax optimality for both the parametric and nonparametric components. Numerical experiments on both synthetic data and an ATP tennis dataset are conducted to support our findings.
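The estimation step in the abstract (a neural network for the covariate effect, fitted jointly with the utilities by maximum likelihood) can be illustrated with a small NumPy sketch. Everything specific here is an assumption made for illustration: a Bradley-Terry-type likelihood, a one-hidden-layer tanh network, full-batch gradient descent, and the hypothetical function name `fit_semiparametric_btl`. The paper's actual architecture and optimizer are not described in this summary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_semiparametric_btl(i, j, x_i, x_j, y, n_objects,
                           hidden=16, steps=500, lr=0.1, seed=0):
    """Gradient descent on the average negative log-likelihood (illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = x_i.shape
    u = np.zeros(n_objects)                          # utilities
    W1 = rng.normal(scale=0.5, size=(d, hidden))     # network weights for f
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=hidden)
    losses = []
    for _ in range(steps):
        h_i = np.tanh(x_i @ W1 + b1)
        h_j = np.tanh(x_j @ W1 + b1)
        logit = u[i] - u[j] + (h_i - h_j) @ w2       # f(x_i) - f(x_j) plus utility gap
        losses.append(np.mean(np.logaddexp(0.0, logit) - y * logit))
        g = (sigmoid(logit) - y) / n                 # d loss / d logit
        grad_u = np.zeros(n_objects)
        np.add.at(grad_u, i, g)                      # accumulate over repeated indices
        np.add.at(grad_u, j, -g)
        grad_w2 = (h_i - h_j).T @ g
        d_i = (g[:, None] * w2) * (1.0 - h_i ** 2)   # backprop through tanh, object i
        d_j = (-g[:, None] * w2) * (1.0 - h_j ** 2)  # object j enters with a minus sign
        grad_W1 = x_i.T @ d_i + x_j.T @ d_j
        grad_b1 = d_i.sum(axis=0) + d_j.sum(axis=0)
        u -= lr * grad_u
        u -= u.mean()                                # keep the sum-to-zero normalization
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
    return u, (W1, b1, w2), losses

# Small synthetic run with a made-up covariate effect.
rng = np.random.default_rng(1)
N, n = 8, 4000
u_true = rng.normal(size=N)
u_true -= u_true.mean()
i = rng.integers(0, N, size=n)
j = (i + rng.integers(1, N, size=n)) % N             # guarantees j != i
x_i, x_j = rng.uniform(size=(n, 2)), rng.uniform(size=(n, 2))
f_true = lambda x: np.sin(2.0 * np.pi * x[:, 0])
y = rng.binomial(1, sigmoid(u_true[i] - u_true[j] + f_true(x_i) - f_true(x_j)))
u_hat, net, losses = fit_semiparametric_btl(i, j, x_i, x_j, y, N)
```

The utilities and the network share one likelihood, so the utility gradient is just the per-comparison residual scattered onto the two objects involved, while the network gradient flows through both covariate vectors with opposite signs.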

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper adds neural-net covariate effects to fixed-utility ranking models and claims identifiability plus minimax-optimal rates under random design.

The paper's main contribution is a semiparametric ranking model that adds neural-network covariate effects to fixed utility parameters, together with identifiability results and optimal error bounds. The authors model the log-score as a utility plus a nonparametric covariate term, approximate that term with a neural network, and estimate everything by maximum likelihood. They prove identifiability under regularity and connectivity conditions and obtain non-asymptotic bounds that match minimax rates for both the parametric utilities and the nonparametric effects under random design.

What stands out is the separation of intrinsic scores from contextual effects, which classical models often conflate. The theory looks solid on paper, with explicit bounds rather than bare existence results. The synthetic and ATP tennis experiments serve as illustration.

On the downside, the random-design assumption is central and may not hold in practice, for example under fixed tournament schedules. The connectivity condition for identifiability needs checking in each application. Without the full proofs, it is hard to see exactly how the neural-network approximation error is controlled without degrading the rates. The real-data section supports the idea but does not explore the heterogeneous effects in depth.

The paper is aimed at statisticians and ML researchers working on ranking systems or semiparametric estimation; anyone looking for theoretical backing for hybrid models would find it useful. It has enough structure and substance to merit peer review rather than a desk rejection. I'd recommend sending it to a statistics or ML theory journal for feedback.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a semiparametric ranking framework in which each object's log-score is the sum of a parametric utility parameter and a nonparametric covariate effect. It claims to prove identifiability of the model under mild regularity and connectivity conditions, proposes maximum-likelihood estimation after approximating the nonparametric component with a neural network, and derives non-asymptotic error bounds that attain minimax optimality for both the parametric and nonparametric parts under random-design assumptions. Synthetic experiments and an application to an ATP tennis dataset are presented to illustrate the method.

Significance. If the identifiability result and the minimax-optimal non-asymptotic bounds hold, the work would supply a theoretically grounded approach to separating intrinsic scores from heterogeneous contextual effects in ranking problems. The combination of semiparametric modeling with neural-network approximation, together with explicit rates for both components, would be a useful addition to the literature on ranking and semiparametric estimation.

major comments (2)

[Abstract and theoretical results section] The central non-asymptotic bounds (claimed in the abstract to be minimax optimal for both components) rest on controlling the neural-network approximation error together with the semiparametric MLE error. The precise dependence of the bound on network width, depth, and the smoothness class of the covariate effect must be stated explicitly; without this, it is impossible to verify that the claimed rate is indeed minimax and not degraded by the approximation step.
[Identifiability theorem] Identifiability is asserted under 'mild regularity and connectivity conditions.' These conditions appear to include a connectivity requirement on the ranking graph induced by the observed comparisons. If the observed comparison graph is sparse or disconnected (common in real ranking data such as the ATP dataset), the conditions may fail; the paper should clarify whether the connectivity assumption is automatically satisfied by the random-design model or requires additional verification.
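The balance that the first major comment asks the authors to make explicit has a familiar shape in generic nonparametric regression. The sketch below shows that textbook heuristic only; it assumes a β-Hölder target on [0,1]^d and a network with N nonzero weights, and it is not the paper's theorem, whose exact exponents, constants, and dependence on the comparison graph are not reproduced here.

```latex
% Heuristic bias-variance balance for a \beta-Hölder f^* on [0,1]^d,
% approximated by a network with N nonzero weights (illustrative assumption).
\underbrace{N^{-2\beta/d}}_{\text{squared approximation error}}
\;\asymp\;
\underbrace{\frac{N \log n}{n}}_{\text{estimation error}}
\quad\Longrightarrow\quad
N \asymp n^{d/(2\beta + d)} \ \text{(up to logarithms)},
\qquad
\|\hat{f} - f^*\|_{L_2}^2 \;\lesssim\; n^{-2\beta/(2\beta + d)} \operatorname{polylog}(n).
```

The exponent 2β/(2β+d) is the classical minimax rate for estimating a β-smooth function of d variables, so a bound of this form would not be degraded by the approximation step beyond logarithmic factors. Whether the paper's network-size choice achieves this is exactly what the explicit statement would let a reader verify.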

minor comments (2)

[Model and estimation sections] Notation for the neural-network approximator (e.g., the class of networks, activation functions, and parameter constraints) should be introduced once and used consistently; currently the abstract and later sections appear to use slightly different symbols for the same objects.
[Numerical experiments] The description of the ATP tennis dataset (number of matches, players, covariates used) is brief; a short table summarizing the data characteristics would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and will revise the manuscript to improve clarity on the theoretical details.

Referee: [Abstract and theoretical results section] The central non-asymptotic bounds (claimed in the abstract to be minimax optimal for both components) rest on controlling the neural-network approximation error together with the semiparametric MLE error. The precise dependence of the bound on network width, depth, and the smoothness class of the covariate effect must be stated explicitly; without this, it is impossible to verify that the claimed rate is indeed minimax and not degraded by the approximation step.

Authors: We appreciate this observation. Our non-asymptotic bounds are obtained by balancing the neural-network approximation error (controlled via width, depth, and the Hölder smoothness class of the covariate effect) against the semiparametric MLE error term. The resulting rates attain the minimax optimum for both components when the network size is chosen to match the smoothness, without degradation from approximation. To make verification straightforward, we will revise the abstract and the statement of the main theorem to explicitly display the dependence on network width, depth, and smoothness class.

revision: yes

Referee: [Identifiability theorem] Identifiability is asserted under 'mild regularity and connectivity conditions.' These conditions appear to include a connectivity requirement on the ranking graph induced by the observed comparisons. If the observed comparison graph is sparse or disconnected (common in real ranking data such as the ATP dataset), the conditions may fail; the paper should clarify whether the connectivity assumption is automatically satisfied by the random-design model or requires additional verification.

Authors: We thank the referee for this important clarification request. The connectivity condition on the comparison graph is required for unique identification of the utility parameters. Under the random-design model (independent comparisons with positive probability), standard random-graph arguments imply that the graph is connected with high probability for large samples. For the ATP dataset we verify post hoc that the observed matches induce a connected graph. We will add a remark in the identifiability section clarifying that the random-design assumption ensures connectivity with high probability and how to check the condition in practice.

revision: yes
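The post hoc connectivity check mentioned in the second response is simple to run on any list of observed comparisons. A minimal sketch, assuming objects are labeled 0 to n_objects-1 and that plain undirected connectivity is the property of interest; the function name `is_connected` is hypothetical, and the paper's exact condition may be stronger than this.

```python
from collections import defaultdict, deque

def is_connected(n_objects, comparisons):
    """Return True if the undirected comparison graph links all objects."""
    adj = defaultdict(set)
    for a, b in comparisons:
        adj[a].add(b)
        adj[b].add(a)
    seen = {0}
    queue = deque([0])
    while queue:                       # breadth-first search from object 0
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == n_objects

print(is_connected(4, [(0, 1), (1, 2), (2, 3)]))   # True
print(is_connected(4, [(0, 1), (2, 3)]))           # False
```

If the check fails, utilities in different components cannot be compared with each other, since no chain of observed matches links them.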

Circularity Check

0 steps flagged

No significant circularity detected

The derivation chain rests on establishing identifiability of the semiparametric model (log-score = utility parameter + nonparametric covariate effect) under explicitly stated mild regularity and connectivity conditions, followed by existence and non-asymptotic minimax bounds for the neural-network MLE under random-design assumptions. These steps invoke standard external statistical assumptions and do not reduce any claimed result to a fitted parameter, self-citation, or definitional tautology by construction. The central claims retain independent content from the model specification and proof techniques.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard regularity conditions for identifiability in semiparametric models and random-design assumptions for convergence rates; no new entities are postulated and the utility parameters are estimated rather than invented.

free parameters (1)

utility parameters
Object-specific intrinsic scores estimated via maximum likelihood; these are the parametric component of the model.

axioms (2)

domain assumption: mild regularity and connectivity conditions
Invoked for model identifiability as stated in the abstract.
domain assumption: random design assumptions
Required for existence of the estimator with high probability and for the non-asymptotic error bounds.

pith-pipeline@v0.9.0 · 5425 in / 1209 out tokens · 45719 ms · 2026-05-10T08:04:10.622938+00:00 · methodology
