
arXiv: 2503.13569 · v2 · submitted 2025-03-17 · physics.soc-ph · math.OC · stat.AP

Ranking matters: Does the new format select the best teams for the knockout phase in the UEFA Champions League?

Pith reviewed 2026-05-22 23:48 UTC · model grok-4.3

classification: physics.soc-ph · math.OC · stat.AP
keywords: UEFA Champions League · league phase · ranking methods · incomplete round robin · points system · tournament ranking · sports competition

The pith

The official points-based ranking in the new UEFA Champions League league phase may not select the best teams.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the new league phase format of the UEFA Champions League, where 36 teams compete in an incomplete round robin. Because teams face different sets of opponents with varying strengths, direct comparison by points is complicated. Several established ranking methods for such tournaments are applied to the 2024/25 season data. These methods produce orderings that differ from the official standings. The results indicate that the points system may not reliably identify the strongest teams for the knockout phase.

Core claim

The paper claims that applying several well-known ranking methods for incomplete round robin tournaments to the 2024/25 UEFA Champions League league phase shows inconsistencies with the official ranking, making it doubtful whether the currently used point-based system provides the best ranking of the teams.

What carries the argument

Ranking methods for incomplete round robin tournaments, used to test the robustness of the official points-based ranking.

Load-bearing premise

That the alternative ranking methods applied are more appropriate or accurate than the official points system for determining advancement.

What would settle it

A direct comparison of which teams advance under the points system versus the alternative methods, checked against their actual performance in the subsequent knockout phase.
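
A minimal sketch of such a check, in Python: given a ranking and a list of knockout ties with their winners, count how often the better-ranked side advanced. The team labels, rank positions, and tie outcomes are hypothetical placeholders rather than 2024/25 results, and `knockout_hit_rate` is an illustrative helper name, not something taken from the paper.

```python
def knockout_hit_rate(rank, ties):
    """Share of knockout ties won by the side the ranking places higher.

    rank: dict mapping team -> position (1 = best).
    ties: list of (team_a, team_b, winner) tuples.
    """
    hits = 0
    for team_a, team_b, winner in ties:
        favourite = team_a if rank[team_a] < rank[team_b] else team_b
        hits += (favourite == winner)
    return hits / len(ties)


# Hypothetical placeholder data, NOT real 2024/25 results.
official_rank = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}
alternative_rank = {"T1": 1, "T3": 2, "T2": 3, "T4": 4}
ties = [("T1", "T4", "T1"), ("T2", "T3", "T3")]

official_rate = knockout_hit_rate(official_rank, ties)
alternative_rate = knockout_hit_rate(alternative_rank, ties)
```

A single season supplies only a small number of ties, so a hit-rate gap of this kind would be a noisy signal unless it were pooled over several seasons.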

Original abstract

Starting in the 2024/25 season, the Union of European Football Associations (UEFA) has fundamentally changed the format of its club competitions: the group stage has been replaced by a league phase played by 36 teams in an incomplete round robin format. This makes ranking the teams based on their results challenging because teams play against different sets of opponents, whose strengths vary. In this research note, we apply several well-known ranking methods for incomplete round robin tournaments to the 2024/25 UEFA Champions League league phase in order to check the robustness of the official ranking, as well as to call the attention of organizers to the non-trivial issue of ranking in these competitions. Our results show that it is doubtful whether the currently used point-based system provides the best ranking of the teams.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript applies several well-known ranking methods for incomplete round-robin tournaments (e.g., Massey, Colley, and eigenvector-based approaches) to the 2024/25 UEFA Champions League league phase. It compares the resulting team orderings against the official points-based ranking and concludes that discrepancies indicate it is doubtful whether the points system provides the best ranking for selecting teams for the knockout phase.
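
To make the schedule-strength issue concrete, here is a minimal sketch of the Colley method on a toy incomplete round robin, written in pure Python. It assumes the standard Colley formulation (matrix diagonal 2 + games played, off-diagonal minus the number of meetings, right-hand side 1 + (wins - losses)/2) and treats a draw as leaving the right-hand side unchanged. Neither choice is confirmed as the paper's exact implementation, and the four teams and results are invented for illustration.

```python
def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def colley_ratings(teams, results):
    """results: list of (team_a, team_b, goals_a, goals_b)."""
    index = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for team_a, team_b, goals_a, goals_b in results:
        i, j = index[team_a], index[team_b]
        C[i][i] += 1.0
        C[j][j] += 1.0
        C[i][j] -= 1.0
        C[j][i] -= 1.0
        if goals_a != goals_b:  # assumption: a draw leaves b unchanged
            winner, loser = (i, j) if goals_a > goals_b else (j, i)
            b[winner] += 0.5
            b[loser] -= 0.5
    return dict(zip(teams, solve_linear(C, b)))


def points_table(teams, results):
    """Football points: 3 for a win, 1 for a draw, 0 for a loss."""
    pts = {t: 0 for t in teams}
    for team_a, team_b, goals_a, goals_b in results:
        if goals_a > goals_b:
            pts[team_a] += 3
        elif goals_b > goals_a:
            pts[team_b] += 3
        else:
            pts[team_a] += 1
            pts[team_b] += 1
    return pts


# Invented toy schedule: each team plays two of its three possible opponents.
teams = ["A", "B", "C", "D"]
results = [("A", "B", 1, 0), ("B", "C", 1, 0), ("C", "D", 1, 0), ("A", "D", 1, 0)]
pts = points_table(teams, results)
ratings = colley_ratings(teams, results)
```

In this toy schedule B and C finish level on 3 points, but Colley rates B above C: B lost to the unbeaten A and beat C, whereas C's only win came against the winless D. The same mechanism can separate teams that the points table treats as equal in a 36-team league phase.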

Significance. The work usefully illustrates the sensitivity of rankings to method choice in unbalanced schedules with heterogeneous opponents, a timely issue given the new UCL format. The use of real 2024/25 data provides a concrete case study. If an independent validation criterion were supplied showing that alternatives outperform points on a measurable objective (e.g., predictive power), the findings could inform tournament design; absent that, the significance for questioning the official system remains limited.

major comments (2)
  1. [Abstract] The central claim that 'it is doubtful whether the currently used point-based system provides the best ranking of the teams' is not supported by any independent criterion for 'best.' The manuscript reports rank differences across methods but supplies no validation (such as out-of-sample prediction of knockout outcomes, recovery of latent strengths, or alignment with a fairness axiom) demonstrating that the alternatives are superior to points under its own design goals.
  2. [Abstract] The abstract states a conclusion but provides no details on data handling, exact implementations of the ranking methods, statistical tests for rank differences, or robustness checks; this prevents verification that observed discrepancies are meaningful rather than artifacts of arbitrary choices or incomplete data.
minor comments (2)
  1. Consider adding a table or figure explicitly listing the top 8-10 teams under each method alongside their official points positions and any qualification changes.
  2. Clarify in the methods whether the alternative rankings were computed on the full observed results or adjusted for schedule imbalance beyond the standard formulations.
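
The comparisons requested in these comments can be quantified with two simple statistics: the number of team pairs that two orderings rank in opposite order (the Kendall tau distance), and the set of teams that cross a qualification cutoff. The sketch below is a hedged illustration with hypothetical team labels; the helper names are not from the paper. For the league phase, the natural cutoffs would be k = 8 and k = 24, as we understand the format.

```python
from itertools import combinations


def discordant_pairs(order_a, order_b):
    """Number of team pairs ranked in opposite order by the two orderings."""
    pos_b = {team: k for k, team in enumerate(order_b)}
    return sum(1 for x, y in combinations(order_a, 2) if pos_b[x] > pos_b[y])


def cutoff_changes(order_a, order_b, k):
    """Teams inside the top k under order_a only, and under order_b only."""
    top_a, top_b = set(order_a[:k]), set(order_b[:k])
    return sorted(top_a - top_b), sorted(top_b - top_a)


# Hypothetical orderings (best team first), NOT real standings.
official = ["T1", "T2", "T3", "T4", "T5"]
alternative = ["T1", "T3", "T2", "T5", "T4"]

swaps = discordant_pairs(official, alternative)
dropped, promoted = cutoff_changes(official, alternative, k=2)
```

Here `swaps` counts the pairs (T2, T3) and (T4, T5), while `dropped` and `promoted` list the teams that would lose or gain a top-2 place if the alternative ordering replaced the official one.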

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond point-by-point to the major comments below.

  1. Referee: [Abstract] The central claim that 'it is doubtful whether the currently used point-based system provides the best ranking of the teams' is not supported by any independent criterion for 'best.' The manuscript reports rank differences across methods but supplies no validation (such as out-of-sample prediction of knockout outcomes, recovery of latent strengths, or alignment with a fairness axiom) demonstrating that the alternatives are superior to points under its own design goals.

    Authors: The manuscript is a short research note whose goal is to apply established ranking methods for incomplete round-robin tournaments to the 2024/25 UCL league-phase data and to demonstrate that the official points ranking differs from orderings produced by Massey, Colley, and eigenvector approaches. These discrepancies indicate that the points system is not robust to reasonable methodological alternatives, thereby raising doubt about whether it yields the 'best' ranking when no consensus definition of 'best' exists for unbalanced schedules. The note does not claim or attempt to demonstrate superiority of any alternative via an external validation criterion, as that would require a different study; its purpose is to highlight the non-trivial ranking problem created by the new format. We are prepared to revise the abstract to clarify this scope and moderate the concluding language.
    revision: partial

  2. Referee: [Abstract] The abstract states a conclusion but provides no details on data handling, exact implementations of the ranking methods, statistical tests for rank differences, or robustness checks; this prevents verification that observed discrepancies are meaningful rather than artifacts of arbitrary choices or incomplete data.

    Authors: The abstract is deliberately concise, as is conventional for a research note. The main text supplies the data source (official UEFA match results), the precise formulations and references for the Massey, Colley, and eigenvector methods, and the direct rank comparisons. We will revise the abstract to include a brief statement on the data and methods employed. Additional statistical tests or robustness checks can be added to the body of the paper during revision if the editor deems them necessary.
    revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison of independent ranking methods on external data


The paper applies several established ranking methods for incomplete round-robin tournaments directly to the observed 2024/25 UCL league phase match results and compares the resulting rankings to the official points system. There are no equations, fitted parameters, predictions derived from subsets of the data, self-citations invoked as uniqueness theorems, or ansatzes smuggled in. The central claim of doubt regarding the points system rests on observed discrepancies between independent methods and the official ranking, which is a standard empirical sensitivity check rather than a reduction by construction. The derivation chain is self-contained against external benchmarks (real match outcomes) with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no information on free parameters, axioms, or invented entities.


