Entrywise Error Bounds for Spectral Ranking with Semi-Random Adversaries

Anuran Makur; Dongmin Lee; Japneet Singh

arxiv: 2605.23854 · v1 · pith:MCPTHF4Hnew · submitted 2026-05-22 · 💻 cs.LG · math.ST· stat.ML· stat.TH

Entrywise Error Bounds for Spectral Ranking with Semi-Random Adversaries

Dongmin Lee , Anuran Makur , Japneet Singh This is my paper

Pith reviewed 2026-05-25 04:42 UTC · model grok-4.3

classification 💻 cs.LG math.STstat.MLstat.TH

keywords spectralestimationgraphsperformanceadversaryedgeserrorsampled

0 comments

The pith

Entrywise error bounds for spectral methods on BTL ranking under semi-random adversaries depend on graph spectral properties and can be recovered asymptotically via reweighting to restore the spectral gap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Ranking items from pairwise comparisons often uses the Bradley-Terry-Luce model, where each pair has a win probability based on hidden strengths. Spectral methods turn the comparison graph into an eigenvector problem to estimate those strengths. When an adversary makes some edges more likely to appear, the standard unweighted spectral estimator's accuracy varies with how well-connected the resulting graph is. The paper shows that reweighting the observed edges to offset the boosted probabilities can bring the error bounds back to levels seen under uniform random sampling, supported by theory and simulations.

Core claim

asymptotic performance approaching that of uniformly sampled graphs can be recovered by appropriately reweighting the observed edges to counteract the adversary and restore the spectral gap

Load-bearing premise

The reweighting procedure can be performed with knowledge or accurate estimation of which edges were boosted by the adversary so that the effective graph regains a sufficient spectral gap (abstract does not specify how the reweighting weights are obtained).

Figures

Figures reproduced from arXiv: 2605.23854 by Anuran Makur, Dongmin Lee, Japneet Singh.

**Figure 2.** Figure 2: Experiment 1 (Subfigures (a)–(c)): 3-block SBM with edge sampling probabilities between each pair of blocks given by the matrix (7). Experiment 2 (Subfigures (d)–(f)): Erdős-Rényi model with edge sampling probability 𝑝 = 2 log(𝑛)/𝑛. 𝑃 =        1 1 0 1 2 log(𝑛) 𝑛 2 log(𝑛) 𝑛 0 2 log(𝑛) 𝑛 2 log(𝑛) 𝑛        , (7) where 𝑛 ∈ {30, 45, 60, 75, 90, 105, 120, 135} is the number of items (note that we c… view at source ↗

read the original abstract

Bradley-Terry-Luce (BTL) model estimation is a well-established strategy to rank a collection of items given a dataset of pairwise comparisons. Although the theoretical performance of BTL estimation methods, such as spectral and maximum likelihood estimation, is well studied in the regime of uniformly sampled graphs, generalizing such results to a wider class of random graphs has proved challenging. In this work, we investigate the entry-wise error of spectral algorithms against a semi-random adversary that can arbitrarily boost the sampling probabilities of certain edges. We find that the performance of the unweighted spectral method is heavily dependent on the spectral properties of the generated graph. Furthermore, we show that asymptotic performance approaching that of uniformly sampled graphs can be recovered by appropriately reweighting the observed edges to counteract the adversary and restore the spectral gap. Finally, we provide numerical simulations that support our theoretical findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Reweighting can recover uniform-graph rates for spectral ranking under semi-random edge boosts, but the procedure for selecting those weights is not described clearly enough to judge practicality.

read the letter

The one thing to know is that this paper extends entrywise error analysis for spectral ranking to a semi-random adversary that can boost some edge probabilities, and shows that reweighting the observed edges can asymptotically restore the performance of the uniform case by fixing the spectral gap. That extension and the reweighting recovery are the main new pieces relative to the uniform-graph literature they cite. The paper does a reasonable job explaining why the plain spectral method is sensitive to the resulting graph spectrum and then outlining how reweighting counters the adversary, with simulations offered as supporting evidence. The simulations are at least mentioned, which is better than nothing. The soft spot is the reweighting step itself. The abstract states that appropriate reweighting counteracts the adversary and restores the gap, but gives no algorithm, estimator, or assumption for how the weights are obtained from the data. If the construction needs an oracle that identifies the boosted edges or their factors, the result is conditional on information that is not available in the stated model. If the weights are meant to be estimated, the paper would need to bound the effect of that estimation error on the restored gap, and that analysis is not visible at the level provided. The simulations do not resolve this because no details are given on how the weights were chosen in the experiments or whether error bars were computed. This paper is for people working on theoretical guarantees for ranking and spectral methods under non-uniform or adversarial sampling. A reader focused on robust estimation in comparison data could find the bounds and the reweighting idea useful to build on. It deserves a serious referee because the problem is motivated and the claims are specific enough to check in detail. I would recommend sending it to peer review, with the main request being a clear description of the reweighting procedure and any assumptions around it.

Referee Report

1 major / 1 minor

Summary. The paper claims to derive entrywise error bounds for spectral algorithms estimating rankings under the Bradley-Terry-Luce model when the comparison graph is subject to a semi-random adversary that arbitrarily boosts sampling probabilities of selected edges. It shows that the unweighted spectral method's performance depends on the spectral properties of the realized graph, but that appropriately reweighting the observed edges restores the spectral gap and recovers asymptotic performance comparable to the uniform-sampling case; the claims are supported by theoretical analysis and numerical simulations.

Significance. If the reweighting procedure can be implemented from data alone while preserving the restored gap, the result would meaningfully extend spectral ranking guarantees beyond uniform graphs to more realistic semi-random models. The provision of entrywise (rather than only norm) bounds and the simulation support are positive features.

major comments (1)

The central claim that reweighting recovers uniform-graph performance (abstract and main theorem) requires an explicit, non-oracle procedure for selecting the per-edge weights. The manuscript must specify the estimator or algorithm used to obtain these weights from the observed comparisons alone and prove that any estimation error does not destroy the restored spectral gap; without this, the recovery result is conditional on unavailable information.

minor comments (1)

The abstract states that simulations support the findings but provides no information on the number of trials, graph sizes, or error-bar reporting; adding these details would make the empirical evidence easier to assess.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. The single major comment is addressed point-by-point below.

read point-by-point responses

Referee: The central claim that reweighting recovers uniform-graph performance (abstract and main theorem) requires an explicit, non-oracle procedure for selecting the per-edge weights. The manuscript must specify the estimator or algorithm used to obtain these weights from the observed comparisons alone and prove that any estimation error does not destroy the restored spectral gap; without this, the recovery result is conditional on unavailable information.

Authors: We agree that the current manuscript presents the reweighting construction in an oracle setting that assumes knowledge of the ideal per-edge weights restoring the spectral gap. This is a valid observation. In the revision we will add an explicit, fully data-driven procedure that estimates the required weights from the observed comparison counts and node degrees alone. We will also supply a supplementary concentration argument showing that the plug-in estimation error is of lower order than the restored spectral gap and therefore does not invalidate the entrywise error bounds with high probability. The revised main theorem will be stated for this estimated weighting. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation remains self-contained against external spectral gap assumptions

full rationale

The abstract and provided context present entrywise error bounds for spectral ranking under a semi-random adversary, with a corrective reweighting step to restore the spectral gap. No quoted equations or steps reduce a claimed prediction or bound to a fitted quantity defined from the same data by construction. The reweighting is described as an external corrective procedure rather than a self-referential fit. No self-citation chains, uniqueness theorems imported from the same authors, or ansatzes smuggled via prior work appear in the given text. The central claim relies on standard spectral properties of graphs and does not collapse to renaming or self-definition. This is the expected non-finding for a paper whose core bounds are stated in terms of independent graph-theoretic quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard domain assumptions of the BTL model (pairwise probabilities determined by latent scores) and spectral graph theory (existence of a usable spectral gap after reweighting). No free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption Bradley-Terry-Luce model holds for the underlying comparison probabilities
Invoked implicitly as the generative model for the observed comparisons.
domain assumption The reweighted graph admits a sufficient spectral gap for the error bounds to hold
Central to the recovery claim; location is the reweighting step described in the abstract.

pith-pipeline@v0.9.0 · 5683 in / 1247 out tokens · 30067 ms · 2026-05-25T04:42:14.470379+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages

[1]

Eldridge S Adams. 2005. Bayesian analysis of linear dominance hierarchies. Animal Behaviour69, 5 (May 2005), 1191–1201

work page 2005
[2]

Julian Asilis, Xi Chen, Dutch Hansen, and Shang-Hua Teng. 2026. Semi-Random Graphs, Robust Asymmetry, and Reconstruction. InProceedings of the 17th Innova- tions in Theoretical Computer Science Conference (ITCS). Milan, Italy, 12:1–12:21

work page 2026
[3]

Linas Baltrunas, Tadas Makcinskas, and Francesco Ricci. 2010. Group recommen- dations with rank aggregation and collaborative filtering. InProceedings of the Fourth ACM Conference on Recommender Systems. Barcelona, Spain, 119––126

work page 2010
[4]

Daniel Barry and JA Hartigan. 1993. Choice models for predicting divisional winners in major league baseball.J. Amer. Statist. Assoc.88, 423 (1993), 766–774

work page 1993
[5]

Avrim Blum and Joel Spencer. 1995. Coloring random and semi-random 𝑘- colorable graphs.Journal of Algorithms19, 2 (November 1995), 204–234

work page 1995
[6]

Heejong Bong, Wanshan Li, Shamindra Shrotriya, and Alessandro Rinaldo. 2020. Nonparametric estimation in the dynamic Bradley-Terry model. InProceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (AISTATS). 3317–3326

work page 2020
[7]

Stéphane Boucheron, Gábor Lugosi, and Olivier Bousquet. 2004. Concentration Inequalities. InAdvanced Lectures on Machine Learning, Olivier Bousquet, Ulrike von Luxburg, and Gunnar Rätsch (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 208–240

work page 2004
[8]

Ralph Allan Bradley and Milton E. Terry. 1952. Rank Analysis of Incomplete Block Designs. I. The Method of Paired Comparisons.Biometrika39, 3/4 (December 1952), 324–345

work page 1952
[9]

François Caron and Arnaud Doucet. 2012. Efficient Bayesian Inference for Generalized Bradley-Terry Models.Journal of Computational and Graphical Statistics21, 1 (March 2012), 174–196

work page 2012
[10]

Manuela Cattelan, Cristiano Varin, and David Firth. 2013. Dynamic Bradley- Terry modelling of sports tournaments.Journal of the Royal Statistical Society: Series C (Applied Statistics)62, 1 (January 2013), 135–150

work page 2013
[11]

Sourav Chatterjee. 2015. Matrix Estimation by Universal Singular Value Thresh- olding.The Annals of Statistics43, 1 (February 2015), 177–214

work page 2015
[12]

Pinhan Chen, Chao Gao, and Anderson Y Zhang. 2022. Partial recovery for top-𝑘 ranking: optimality of MLE and suboptimality of the spectral method.The Annals of Statistics50, 3 (June 2022), 1618–1652

work page 2022
[13]

Yanxi Chen. 2023. Ranking from Pairwise Comparisons in General Graphs and Graphs with Locality. (April 2023). https://arxiv.org/abs/2304.06821 arXiv:2304.06821 [stat.ML]

work page arXiv 2023
[14]

Yuxin Chen, Jianqing Fan, Cong Ma, and Kaizheng Wang. 2019. Spectral Method and Regularized MLE are both Optimal for Top- 𝐾 Ranking.The Annals of Statistics47, 4 (2019), 2204–2235

work page 2019
[15]

Yuxin Chen and Changho Suh. 2015. Spectral MLE: Top- 𝑘 rank aggregation from pairwise comparisons. InProceedings of the 32nd International Conference on Machine Learning. Lille, France, 371–380

work page 2015
[16]

Yu Cheng and Rong Ge. 2018. Non-convex matrix completion against a semi- random adversary. InProceedings of the 31st Conference On Learning Theory (COLT). Stockholm, Sweden, 1362–1394

work page 2018
[17]

Gonzalez, and Ion Stoica

Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Banghua Zhu, Hao Zhang, Michael Jordan, Joseph E. Gonzalez, and Ion Stoica. 2024. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. InProceedings of the 41st International Conference on Machine Learning (ICML). Vienna, Austria, 8359–8388

work page 2024
[18]

Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep reinforcement learning from human preferences. InPro- ceedings in the Advances of Neural Information Processing Systems 30 (NIPS). Long Beach, CA, USA, 4299–4307

work page 2017
[19]

Ronen Eldan, Miklós Z Rácz, and Tselil Schramm. 2017. Braess’s paradox for the spectral gap in random graphs and delocalization of eigenvectors.Random Structures & Algorithms50, 4 (July 2017), 584–611

work page 2017
[20]

Arpad E. Elo. 1978.The Rating of Chess Players, Past and Present. Arco, New York, NY, USA

work page 1978
[21]

Jianqing Fan, Zhipeng Lou, Weichen Wang, and Mengxin Yu. 2025. Ranking inferences based on the top choice of multiway comparisons.J. Amer. Statist. Assoc.120, 549 (March 2025), 237–250

work page 2025
[22]

Jianqing Fan, Zhipeng Lou, Weichen Wang, and Mengxin Yu. 2026. Spectral Rank- ing Inferences based on General Multiway Comparisons.Operations Research74, 1 (January–February 2026), 161–180

work page 2026
[23]

Uriel Feige and Joe Kilian. 2001. Heuristics for semirandom graph problems.J. Comput. System Sci.63, 4 (December 2001), 639–671

work page 2001
[24]

Miroslav Fiedler. 1973. Algebraic connectivity of graphs.Czechoslovak Mathe- matical Journal23, 2 (1973), 298–305

work page 1973
[25]

L. R. Ford. 1957. Solution of a Ranking Problem from Binary Comparisons.The American Mathematical Monthly64, 8 Part 2 (October 1957), 28–33

work page 1957
[26]

Chao Gao, Yandi Shen, and Anderson Y. Zhang. 2023. Uncertainty quantification in the Bradley-Terry-Luce model.Information and Inference12, 2 (June 2023), 1073–1140

work page 2023
[27]

Xing Gao and Yu Cheng. 2023. Robust matrix sensing in the semi-random model. InProceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS). New Orleans, LA, USA, 63385–63408

work page 2023
[28]

John Guiver and Edward Snelson. 2009. Bayesian inference for Plackett-Luce ranking models. InProceedings of the 26th Annual International Conference on Machine Learning (ICML). Montreal, QC, Canada, 377–384

work page 2009
[29]

Julien Hendrickx, Alex Olshevsky, and Venkatesh Saligrama. 2020. Minimax Rate for Learning From Pairwise Comparisons in the BTL Model. InProceedings of the 37th Annual International Conference on Machine Learning (ICML), Vol. 119. Vienna, Austria, 4193–4202

work page 2020
[30]

Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt

Paul W. Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. 1983. Sto- chastic blockmodels: First steps.Social Networks5, 2 (June 1983), 109–137

work page 1983
[31]

David R. Hunter. 2004. MM Algorithms for Generalized Bradley-Terry Models. The Annals of Statistics32, 1 (February 2004), 384–406

work page 2004
[32]

Ali Jadbabaie, Anuran Makur, and Devavrat Shah. 2020. Estimation of Skill Distri- bution from a Tournament. InProceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS). Vancouver, BC, Canada, 8418–8429

work page 2020
[33]

Ali Jadbabaie, Anuran Makur, and Devavrat Shah. 2024. Estimation of skill distributions.IEEE Transactions on Information Theory70, 9 (September 2024), 6447–6480

work page 2024
[34]

Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, and Kevin Tian

work page
[35]

(November 2021)

Fast and Near-Optimal Diagonal Preconditioning. (November 2021). arXiv:2008.01722v2 [math.OC]

work page arXiv 2021
[36]

Minje Jang, Sunghyun Kim, and Changho Suh. 2018. Top-𝐾 rank aggregation from 𝑀-wise comparisons.IEEE Journal of Selected Topics in Signal Processing 12, 5 (May 2018), 989–1004

work page 2018
[37]

Jonathan Kelner, Jerry Li, Allen X Liu, Aaron Sidford, and Kevin Tian. 2023. Semi- random sparse recovery in nearly-linear time. InProceedings of the Thirty Sixth Annual Conference on Learning Theory (COLT). Bangalore, India, 2352–2398

work page 2023
[38]

Levin, Yuval Peres, and Elizabeth L

David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. 2009.Markov Chains and Mixing Times(first ed.). American Mathematical Society

work page 2009
[39]

Wanshan Li, Shamindra Shrotriya, and Alessandro Rinaldo. 2022. ℓ∞-Bounds of the MLE in the BTL Model under General Comparison Graphs. InProceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence. Eindhoven, The Netherlands, 1178–1187

work page 2022
[40]

Fang, and Junwei Lu

Yue Liu, Ethan X. Fang, and Junwei Lu. 2023. Lagrangian inference for ranking problems.Operations Research71, 1 (January–February 2023), 202–223

work page 2023
[41]

Matthias Löwe and Sara Terveer. 2025. Spectral properties of the stochastic block model and their application to hitting times of random walks. (April 2025). https://arxiv.org/abs/2401.07896v3 arXiv:2401.07896v3 [math.PR]

work page arXiv 2025
[42]

1959.Individual Choice Behavior: A Theoretical Analysis

Robert Duncan Luce. 1959.Individual Choice Behavior: A Theoretical Analysis. John Wiley & Sons Inc., New York, NY, USA

work page 1959
[43]

Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. 2012. Approximation algorithms for semi-random partitioning problems. InProceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing. New York, NY, USA, 367–384

work page 2012
[44]

Anuran Makur and Japneet Singh. 2025. Hypothesis Testing for Generalized Thurstone Models. InProceedings of the 42nd International Conference on Machine Learning (ICML). Vancouver, BC, Canada, 42730–42764

work page 2025
[45]

Anuran Makur and Japneet Singh. 2025. Minimax Hypothesis Testing for the Bradley–Terry–Luce Model.IEEE Transactions on Information Theory71, 12 (December 2025), 9163–9202

work page 2025
[46]

Kenneth Massey. 1997. Statistical models applied to the rating of sports teams. (1997). Bluefield College

work page 1997
[47]

J. N. S. Matthews and K. P. Morris. 1995. An Application of Bradley-Terry-Type Models to the Measurement of Pain.Journal of the Royal Statistical Society Series C (Applied Statistics)44, 2 (1995), 243–255. 9 Lee, Makur, and Singh

work page 1995
[48]

Sahand Negahban, Sewoong Oh, and Devavrat Shah. 2017. Rank Centrality: Rank- ing from Pairwise Comparisons.Operations Research65, 1 (January–February 2017), 266–287

work page 2017
[49]

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schul- man, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human...

work page 2022
[50]

1999.The PageRank Citation Ranking: Bringing Order to the Web

Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999.The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66. Stanford InfoLab

work page 1999
[51]

Plackett

Robin L. Plackett. 1975. The Analysis of Permutations.Journal of the Royal Statistical Society, Series C (Applied Statistics)24, 2 (1975), 193–202

work page 1975
[52]

Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kan- nan Ramchandran, and Martin J

Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kan- nan Ramchandran, and Martin J. Wainwright. 2016. Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence.Journal of Machine Learning Research17, 58 (2016), 1–47

work page 2016
[53]

Gordon Simons and Yi-Ching Yao. 1999. Asymptotics when the number of parameters tends to infinity in the Bradley-Terry model for paired comparisons. The Annals of Statistics27, 3 (June 1999), 1041–1060

work page 1999
[54]

Stephen M. Stigler. 1994. Citation Patterns in the Journals of Statistics and Probability.Statist. Sci.9, 1 (February 1994), 94–108

work page 1994
[55]

Joel A Tropp. 2015. An introduction to matrix concentration inequalities.Foun- dations and Trends in Machine Learning8, 1-2 (May 2015), 1–230

work page 2015
[56]

Yuepeng Yang, Antares Chen, Lorenzo Orecchia, and Cong Ma. 2024. Top- 𝐾 ranking with a monotone adversary. InProceedings of the Thirty Seventh Conference on Learning Theory (COLT), Shipra Agrawal and Aaron Roth (Eds.). Edmonton, AB, Canada, 5123–5162

work page 2024
[57]

𝑎≥𝑏with probability at least1−𝑂(𝑛 −𝑡 )

Ernst Zermelo. 1929. Die Berechnung der Turnier-Ergebnisse als ein Maxi- mumproblem der Wahrscheinlichkeitsrechnung.Mathematische Zeitschrift29, 1 (December 1929), 436–460. A Introduction to Appendices In the appendices, we prove the various theorems and propositions introduced in the body. We use the shorthand 𝑎≥ 1−𝑂(𝑛 −𝑡 ) 𝑏 to say “𝑎≥𝑏with probability ...

work page 1929

[1] [1]

Eldridge S Adams. 2005. Bayesian analysis of linear dominance hierarchies. Animal Behaviour69, 5 (May 2005), 1191–1201

work page 2005

[2] [2]

Julian Asilis, Xi Chen, Dutch Hansen, and Shang-Hua Teng. 2026. Semi-Random Graphs, Robust Asymmetry, and Reconstruction. InProceedings of the 17th Innova- tions in Theoretical Computer Science Conference (ITCS). Milan, Italy, 12:1–12:21

work page 2026

[3] [3]

Linas Baltrunas, Tadas Makcinskas, and Francesco Ricci. 2010. Group recommen- dations with rank aggregation and collaborative filtering. InProceedings of the Fourth ACM Conference on Recommender Systems. Barcelona, Spain, 119––126

work page 2010

[4] [4]

Daniel Barry and JA Hartigan. 1993. Choice models for predicting divisional winners in major league baseball.J. Amer. Statist. Assoc.88, 423 (1993), 766–774

work page 1993

[5] [5]

Avrim Blum and Joel Spencer. 1995. Coloring random and semi-random 𝑘- colorable graphs.Journal of Algorithms19, 2 (November 1995), 204–234

work page 1995

[6] [6]

Heejong Bong, Wanshan Li, Shamindra Shrotriya, and Alessandro Rinaldo. 2020. Nonparametric estimation in the dynamic Bradley-Terry model. InProceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (AISTATS). 3317–3326

work page 2020

[7] [7]

Stéphane Boucheron, Gábor Lugosi, and Olivier Bousquet. 2004. Concentration Inequalities. InAdvanced Lectures on Machine Learning, Olivier Bousquet, Ulrike von Luxburg, and Gunnar Rätsch (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 208–240

work page 2004

[8] [8]

Ralph Allan Bradley and Milton E. Terry. 1952. Rank Analysis of Incomplete Block Designs. I. The Method of Paired Comparisons.Biometrika39, 3/4 (December 1952), 324–345

work page 1952

[9] [9]

François Caron and Arnaud Doucet. 2012. Efficient Bayesian Inference for Generalized Bradley-Terry Models.Journal of Computational and Graphical Statistics21, 1 (March 2012), 174–196

work page 2012

[10] [10]

Manuela Cattelan, Cristiano Varin, and David Firth. 2013. Dynamic Bradley- Terry modelling of sports tournaments.Journal of the Royal Statistical Society: Series C (Applied Statistics)62, 1 (January 2013), 135–150

work page 2013

[11] [11]

Sourav Chatterjee. 2015. Matrix Estimation by Universal Singular Value Thresh- olding.The Annals of Statistics43, 1 (February 2015), 177–214

work page 2015

[12] [12]

Pinhan Chen, Chao Gao, and Anderson Y Zhang. 2022. Partial recovery for top-𝑘 ranking: optimality of MLE and suboptimality of the spectral method.The Annals of Statistics50, 3 (June 2022), 1618–1652

work page 2022

[13] [13]

Yanxi Chen. 2023. Ranking from Pairwise Comparisons in General Graphs and Graphs with Locality. (April 2023). https://arxiv.org/abs/2304.06821 arXiv:2304.06821 [stat.ML]

work page arXiv 2023

[14] [14]

Yuxin Chen, Jianqing Fan, Cong Ma, and Kaizheng Wang. 2019. Spectral Method and Regularized MLE are both Optimal for Top- 𝐾 Ranking.The Annals of Statistics47, 4 (2019), 2204–2235

work page 2019

[15] [15]

Yuxin Chen and Changho Suh. 2015. Spectral MLE: Top- 𝑘 rank aggregation from pairwise comparisons. InProceedings of the 32nd International Conference on Machine Learning. Lille, France, 371–380

work page 2015

[16] [16]

Yu Cheng and Rong Ge. 2018. Non-convex matrix completion against a semi- random adversary. InProceedings of the 31st Conference On Learning Theory (COLT). Stockholm, Sweden, 1362–1394

work page 2018

[17] [17]

Gonzalez, and Ion Stoica

Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Banghua Zhu, Hao Zhang, Michael Jordan, Joseph E. Gonzalez, and Ion Stoica. 2024. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. InProceedings of the 41st International Conference on Machine Learning (ICML). Vienna, Austria, 8359–8388

work page 2024

[18] [18]

Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep reinforcement learning from human preferences. InPro- ceedings in the Advances of Neural Information Processing Systems 30 (NIPS). Long Beach, CA, USA, 4299–4307

work page 2017

[19] [19]

Ronen Eldan, Miklós Z Rácz, and Tselil Schramm. 2017. Braess’s paradox for the spectral gap in random graphs and delocalization of eigenvectors.Random Structures & Algorithms50, 4 (July 2017), 584–611

work page 2017

[20] [20]

Arpad E. Elo. 1978.The Rating of Chess Players, Past and Present. Arco, New York, NY, USA

work page 1978

[21] [21]

Jianqing Fan, Zhipeng Lou, Weichen Wang, and Mengxin Yu. 2025. Ranking inferences based on the top choice of multiway comparisons.J. Amer. Statist. Assoc.120, 549 (March 2025), 237–250

work page 2025

[22] [22]

Jianqing Fan, Zhipeng Lou, Weichen Wang, and Mengxin Yu. 2026. Spectral Rank- ing Inferences based on General Multiway Comparisons.Operations Research74, 1 (January–February 2026), 161–180

work page 2026

[23] [23]

Uriel Feige and Joe Kilian. 2001. Heuristics for semirandom graph problems.J. Comput. System Sci.63, 4 (December 2001), 639–671

work page 2001

[24] [24]

Miroslav Fiedler. 1973. Algebraic connectivity of graphs.Czechoslovak Mathe- matical Journal23, 2 (1973), 298–305

work page 1973

[25] [25]

L. R. Ford. 1957. Solution of a Ranking Problem from Binary Comparisons.The American Mathematical Monthly64, 8 Part 2 (October 1957), 28–33

work page 1957

[26] [26]

Chao Gao, Yandi Shen, and Anderson Y. Zhang. 2023. Uncertainty quantification in the Bradley-Terry-Luce model.Information and Inference12, 2 (June 2023), 1073–1140

work page 2023

[27] [27]

Xing Gao and Yu Cheng. 2023. Robust matrix sensing in the semi-random model. InProceedings of the Advances in Neural Information Processing Systems 36 (NeurIPS). New Orleans, LA, USA, 63385–63408

work page 2023

[28] [28]

John Guiver and Edward Snelson. 2009. Bayesian inference for Plackett-Luce ranking models. InProceedings of the 26th Annual International Conference on Machine Learning (ICML). Montreal, QC, Canada, 377–384

work page 2009

[29] [29]

Julien Hendrickx, Alex Olshevsky, and Venkatesh Saligrama. 2020. Minimax Rate for Learning From Pairwise Comparisons in the BTL Model. InProceedings of the 37th Annual International Conference on Machine Learning (ICML), Vol. 119. Vienna, Austria, 4193–4202

work page 2020

[30] [30]

Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt

Paul W. Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. 1983. Sto- chastic blockmodels: First steps.Social Networks5, 2 (June 1983), 109–137

work page 1983

[31] [31]

David R. Hunter. 2004. MM Algorithms for Generalized Bradley-Terry Models. The Annals of Statistics32, 1 (February 2004), 384–406

work page 2004

[32] [32]

Ali Jadbabaie, Anuran Makur, and Devavrat Shah. 2020. Estimation of Skill Distri- bution from a Tournament. InProceedings of the Advances in Neural Information Processing Systems 33 (NeurIPS). Vancouver, BC, Canada, 8418–8429

work page 2020

[33] [33]

Ali Jadbabaie, Anuran Makur, and Devavrat Shah. 2024. Estimation of skill distributions.IEEE Transactions on Information Theory70, 9 (September 2024), 6447–6480

work page 2024

[34] [34]

Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, and Kevin Tian

work page

[35] [35]

(November 2021)

Fast and Near-Optimal Diagonal Preconditioning. (November 2021). arXiv:2008.01722v2 [math.OC]

work page arXiv 2021

[36] [36]

Minje Jang, Sunghyun Kim, and Changho Suh. 2018. Top-𝐾 rank aggregation from 𝑀-wise comparisons.IEEE Journal of Selected Topics in Signal Processing 12, 5 (May 2018), 989–1004

work page 2018

[37] [37]

Jonathan Kelner, Jerry Li, Allen X Liu, Aaron Sidford, and Kevin Tian. 2023. Semi- random sparse recovery in nearly-linear time. InProceedings of the Thirty Sixth Annual Conference on Learning Theory (COLT). Bangalore, India, 2352–2398

work page 2023

[38] [38]

Levin, Yuval Peres, and Elizabeth L

David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. 2009.Markov Chains and Mixing Times(first ed.). American Mathematical Society

work page 2009

[39] [39]

Wanshan Li, Shamindra Shrotriya, and Alessandro Rinaldo. 2022. ℓ∞-Bounds of the MLE in the BTL Model under General Comparison Graphs. InProceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence. Eindhoven, The Netherlands, 1178–1187

work page 2022

[40] [40]

Fang, and Junwei Lu

Yue Liu, Ethan X. Fang, and Junwei Lu. 2023. Lagrangian inference for ranking problems.Operations Research71, 1 (January–February 2023), 202–223

work page 2023

[41] [41]

Matthias Löwe and Sara Terveer. 2025. Spectral properties of the stochastic block model and their application to hitting times of random walks. (April 2025). https://arxiv.org/abs/2401.07896v3 arXiv:2401.07896v3 [math.PR]

work page arXiv 2025

[42] [42]

1959.Individual Choice Behavior: A Theoretical Analysis

Robert Duncan Luce. 1959.Individual Choice Behavior: A Theoretical Analysis. John Wiley & Sons Inc., New York, NY, USA

work page 1959

[43] [43]

Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. 2012. Approximation algorithms for semi-random partitioning problems. InProceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing. New York, NY, USA, 367–384

work page 2012

[44] [44]

Anuran Makur and Japneet Singh. 2025. Hypothesis Testing for Generalized Thurstone Models. InProceedings of the 42nd International Conference on Machine Learning (ICML). Vancouver, BC, Canada, 42730–42764

work page 2025

[45] [45]

Anuran Makur and Japneet Singh. 2025. Minimax Hypothesis Testing for the Bradley–Terry–Luce Model.IEEE Transactions on Information Theory71, 12 (December 2025), 9163–9202

work page 2025

[46] [46]

Kenneth Massey. 1997. Statistical models applied to the rating of sports teams. (1997). Bluefield College

work page 1997

[47] [47]

J. N. S. Matthews and K. P. Morris. 1995. An Application of Bradley-Terry-Type Models to the Measurement of Pain.Journal of the Royal Statistical Society Series C (Applied Statistics)44, 2 (1995), 243–255. 9 Lee, Makur, and Singh

work page 1995

[48] [48]

Sahand Negahban, Sewoong Oh, and Devavrat Shah. 2017. Rank Centrality: Rank- ing from Pairwise Comparisons.Operations Research65, 1 (January–February 2017), 266–287

work page 2017

[49] [49]

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schul- man, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human...

work page 2022

[50] [50]

1999.The PageRank Citation Ranking: Bringing Order to the Web

Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999.The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66. Stanford InfoLab

work page 1999

[51] [51]

Plackett

Robin L. Plackett. 1975. The Analysis of Permutations.Journal of the Royal Statistical Society, Series C (Applied Statistics)24, 2 (1975), 193–202

work page 1975

[52] [52]

Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kan- nan Ramchandran, and Martin J

Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kan- nan Ramchandran, and Martin J. Wainwright. 2016. Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence.Journal of Machine Learning Research17, 58 (2016), 1–47

work page 2016

[53] [53]

Gordon Simons and Yi-Ching Yao. 1999. Asymptotics when the number of parameters tends to infinity in the Bradley-Terry model for paired comparisons. The Annals of Statistics27, 3 (June 1999), 1041–1060

work page 1999

[54] [54]

Stephen M. Stigler. 1994. Citation Patterns in the Journals of Statistics and Probability.Statist. Sci.9, 1 (February 1994), 94–108

work page 1994

[55] [55]

Joel A Tropp. 2015. An introduction to matrix concentration inequalities.Foun- dations and Trends in Machine Learning8, 1-2 (May 2015), 1–230

work page 2015

[56] [56]

Yuepeng Yang, Antares Chen, Lorenzo Orecchia, and Cong Ma. 2024. Top- 𝐾 ranking with a monotone adversary. InProceedings of the Thirty Seventh Conference on Learning Theory (COLT), Shipra Agrawal and Aaron Roth (Eds.). Edmonton, AB, Canada, 5123–5162

work page 2024

[57] [57]

𝑎≥𝑏with probability at least1−𝑂(𝑛 −𝑡 )

Ernst Zermelo. 1929. Die Berechnung der Turnier-Ergebnisse als ein Maxi- mumproblem der Wahrscheinlichkeitsrechnung.Mathematische Zeitschrift29, 1 (December 1929), 436–460. A Introduction to Appendices In the appendices, we prove the various theorems and propositions introduced in the body. We use the shorthand 𝑎≥ 1−𝑂(𝑛 −𝑡 ) 𝑏 to say “𝑎≥𝑏with probability ...

work page 1929