Ranking matters: Does the new format select the best teams for the knockout phase in the UEFA Champions League?
Pith reviewed 2026-05-22 23:48 UTC · model grok-4.3
The pith
The official points-based ranking in the new UEFA Champions League league phase may not select the best teams.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that applying several well-known ranking methods for incomplete round robin tournaments to the 2024/25 UEFA Champions League league phase shows inconsistencies with the official ranking, making it doubtful whether the currently used point-based system provides the best ranking of the teams.
What carries the argument
Ranking methods for incomplete round robin tournaments, used to test the robustness of the official points-based ranking.
Load-bearing premise
That the alternative ranking methods applied are more appropriate or accurate than the official points system for determining advancement.
What would settle it
A direct comparison of which teams advance under the points system versus the alternative methods, checked against their actual performance in the subsequent knockout phase.
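One way to make this check concrete is to list the teams whose qualification status differs between two orderings. The sketch below is illustrative only: the function names are hypothetical, the default cut-offs (top 8 advance directly, places 9 to 24 enter the knockout phase play-offs) describe the 2024/25 format rather than anything stated in this review, and the six-team example is invented toy data, not actual results.

```python
def advancing_sets(ranking, direct=8, playoff=24):
    """Split an ordered list of teams (best first) into three groups.

    Assumed cut-offs: the first `direct` teams qualify directly, teams up to
    position `playoff` enter the play-offs, and the rest are eliminated.
    """
    return {
        "direct": set(ranking[:direct]),
        "playoff": set(ranking[direct:playoff]),
        "eliminated": set(ranking[playoff:]),
    }


def qualification_changes(official, alternative, direct=8, playoff=24):
    """Return {team: (official_status, alternative_status)} for teams whose
    status differs between the two rankings."""
    if set(official) != set(alternative):
        raise ValueError("both rankings must contain the same teams")

    def status_map(ranking):
        groups = advancing_sets(ranking, direct, playoff)
        return {team: label for label, group in groups.items() for team in group}

    off, alt = status_map(official), status_map(alternative)
    return {team: (off[team], alt[team]) for team in official if off[team] != alt[team]}


# Toy data (hypothetical): six teams, top 2 direct, places 3-4 in the play-offs.
official_toy = ["A", "B", "C", "D", "E", "F"]
alternative_toy = ["A", "C", "B", "D", "F", "E"]
print(qualification_changes(official_toy, alternative_toy, direct=2, playoff=4))
```

Teams that appear in the returned dictionary are the ones whose fate depends on the ranking method; checking them against later knockout results would be a separate step that this sketch does not attempt.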
Original abstract
Starting in the 2024/25 season, the Union of European Football Associations (UEFA) has fundamentally changed the format of its club competitions: the group stage has been replaced by a league phase played by 36 teams in an incomplete round robin format. This makes ranking the teams based on their results challenging because teams play against different sets of opponents, whose strengths vary. In this research note, we apply several well-known ranking methods for incomplete round robin tournaments to the 2024/25 UEFA Champions League league phase in order to check the robustness of the official ranking, as well as to call the attention of organizers to the non-trivial issue of ranking in these competitions. Our results show that it is doubtful whether the currently used point-based system provides the best ranking of the teams.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies several well-known ranking methods for incomplete round-robin tournaments (e.g., Massey, Colley, and eigenvector-based approaches) to the 2024/25 UEFA Champions League league phase. It compares the resulting team orderings against the official points-based ranking and concludes that discrepancies indicate it is doubtful whether the points system provides the best ranking for selecting teams for the knockout phase.
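To illustrate how such a method can disagree with points in an unbalanced schedule, the following is a minimal sketch of the Colley method on an invented four-team example in which each team plays only two of the three possible opponents. It is not the paper's implementation: the function names are hypothetical, the match data are toy values, and treating a draw as half a win and half a loss is an assumption made here for simplicity.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def colley_ratings(teams, matches):
    """Colley ratings: solve C r = b with C_ii = 2 + games played by i,
    C_ij = -(games between i and j), b_i = 1 + (wins_i - losses_i) / 2.
    Assumption: a draw adds nothing to b (half a win plus half a loss)."""
    idx = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    c = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for home, away, home_goals, away_goals in matches:
        i, j = idx[home], idx[away]
        c[i][i] += 1.0
        c[j][j] += 1.0
        c[i][j] -= 1.0
        c[j][i] -= 1.0
        if home_goals > away_goals:
            b[i] += 0.5
            b[j] -= 0.5
        elif home_goals < away_goals:
            b[i] -= 0.5
            b[j] += 0.5
    return dict(zip(teams, solve_linear_system(c, b)))


# Toy incomplete round robin (hypothetical results): A and C both collect
# one win and one draw, but A's win came against B and C's against D.
TOY_TEAMS = ["A", "B", "C", "D"]
TOY_MATCHES = [("A", "B", 2, 0), ("C", "D", 1, 0), ("A", "C", 1, 1), ("B", "D", 3, 1)]

toy_ratings = colley_ratings(TOY_TEAMS, TOY_MATCHES)
for team in sorted(toy_ratings, key=toy_ratings.get, reverse=True):
    print(team, round(toy_ratings[team], 4))
```

In this toy example A and C are level on points (four each under 3-1-0 scoring), yet the Colley ratings separate them because A's win came against B, who finishes above D. This is the kind of schedule effect the summary refers to.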
Significance. The work usefully illustrates the sensitivity of rankings to method choice in unbalanced schedules with heterogeneous opponents, a timely issue given the new UCL format. The use of real 2024/25 data provides a concrete case study. If an independent validation criterion were supplied showing that alternatives outperform points on a measurable objective (e.g., predictive power), the findings could inform tournament design; absent that, the significance for questioning the official system remains limited.
major comments (2)
- [Abstract] The central claim that 'it is doubtful whether the currently used point-based system provides the best ranking of the teams' is not supported by any independent criterion for 'best.' The manuscript reports rank differences across methods but supplies no validation (such as out-of-sample prediction of knockout outcomes, recovery of latent strengths, or alignment with a fairness axiom) demonstrating that the alternatives are superior to points under its own design goals.
- [Abstract] The abstract states a conclusion but provides no details on data handling, exact implementations of the ranking methods, statistical tests for rank differences, or robustness checks; this prevents verification that observed discrepancies are meaningful rather than artifacts of arbitrary choices or incomplete data.
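A simple way to quantify how far two orderings disagree, as a starting point for the kind of check requested above, is Kendall's tau. The sketch below is a plain-Python illustration under the assumption that both rankings are strict orderings of the same teams (no ties); the function name and the toy rankings are hypothetical, and it is not a substitute for a formal significance test.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall rank correlation between two strict orderings of the same items.

    Returns 1.0 for identical orderings and -1.0 for exactly reversed ones.
    Assumes no ties, so every pair is either concordant or discordant.
    """
    if set(ranking_a) != set(ranking_b) or len(ranking_a) != len(set(ranking_a)):
        raise ValueError("rankings must be strict orderings of the same items")
    if len(ranking_a) < 2:
        raise ValueError("need at least two items to compare")

    pos_a = {team: i for i, team in enumerate(ranking_a)}
    pos_b = {team: i for i, team in enumerate(ranking_b)}

    concordant = discordant = 0
    for s, t in combinations(ranking_a, 2):
        # A pair is concordant when both rankings order it the same way.
        same_order = (pos_a[s] - pos_a[t]) * (pos_b[s] - pos_b[t]) > 0
        if same_order:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Toy rankings (hypothetical): one adjacent swap among four teams.
print(kendall_tau(["A", "B", "C", "D"], ["A", "C", "B", "D"]))
```

A value close to 1 would suggest that the alternative method mostly reshuffles neighbouring teams, while lower values would indicate broader disagreement with the official table.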
minor comments (2)
- Consider adding a table or figure explicitly listing the top 8-10 teams under each method alongside their official points positions and any qualification changes.
- Clarify in the methods whether the alternative rankings were computed on the full observed results or adjusted for schedule imbalance beyond the standard formulations.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond point-by-point to the major comments below.
- Referee: [Abstract] The central claim that 'it is doubtful whether the currently used point-based system provides the best ranking of the teams' is not supported by any independent criterion for 'best.' The manuscript reports rank differences across methods but supplies no validation (such as out-of-sample prediction of knockout outcomes, recovery of latent strengths, or alignment with a fairness axiom) demonstrating that the alternatives are superior to points under its own design goals.
Authors: The manuscript is a short research note whose goal is to apply established ranking methods for incomplete round-robin tournaments to the 2024/25 UCL league-phase data and to demonstrate that the official points ranking differs from orderings produced by Massey, Colley, and eigenvector approaches. These discrepancies indicate that the points system is not robust to reasonable methodological alternatives, thereby raising doubt about whether it yields the 'best' ranking when no consensus definition of 'best' exists for unbalanced schedules. The note does not claim or attempt to demonstrate superiority of any alternative via an external validation criterion, as that would require a different study; its purpose is to highlight the non-trivial ranking problem created by the new format. We are prepared to revise the abstract to clarify this scope and moderate the concluding language.
Revision: partial
- Referee: [Abstract] The abstract states a conclusion but provides no details on data handling, exact implementations of the ranking methods, statistical tests for rank differences, or robustness checks; this prevents verification that observed discrepancies are meaningful rather than artifacts of arbitrary choices or incomplete data.
Authors: The abstract is deliberately concise, as is conventional for a research note. The main text supplies the data source (official UEFA match results), the precise formulations and references for the Massey, Colley, and eigenvector methods, and the direct rank comparisons. We will revise the abstract to include a brief statement on the data and methods employed. Additional statistical tests or robustness checks can be added to the body of the paper during revision if the editor deems them necessary.
Revision: yes
Circularity Check
No circularity: empirical comparison of independent ranking methods on external data
The paper applies several established ranking methods for incomplete round-robin tournaments directly to the observed 2024/25 UCL league-phase match results and compares the resulting rankings with the official points system. There are no equations, fitted parameters, predictions derived from subsets of the data, self-citations invoked as uniqueness theorems, or ansatzes smuggled in. The central claim of doubt about the points system rests on observed discrepancies between independent methods and the official ranking, which is a standard empirical sensitivity check rather than a reduction by construction. The derivation chain is self-contained against external benchmarks (real match outcomes), with no load-bearing self-referential steps.