arxiv: 2605.18338 · v1 · pith:YAIKOTFAnew · submitted 2026-05-18 · 📊 stat.AP · cs.LG

Robust Player-Conditional Champion Ranking for League of Legends: Style Similarity, Mastery Priors, and Archetype-Constrained Discovery

Pith reviewed 2026-05-19 23:54 UTC · model grok-4.3

classification 📊 stat.AP cs.LG
keywords champion recommendation · League of Legends · player-conditional ranking · style similarity · mastery priors · archetype clustering · interpretable recommendations · sparse behavioral data

The pith

A framework ranks League of Legends champions for specific players by blending population strength, style similarity, mastery experience, and archetype constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formalizes champion recommendation in League of Legends as the problem of producing rankings that fit an individual player's history and preferences, rather than relying on global win rates alone. It combines four sources of information into one score: how strong each champion is in the overall population, how well the player's past choices align with typical users of that champion, how much direct or indirect practice the player has with it, and how well it fits broad playstyle groups. This matters because per-player data is usually sparse, noisy, and shifts over time, so a score that stays decomposed and checkable could give more useful suggestions than either pure popularity lists or opaque prediction models.

Core claim

The central claim is that a modular method can produce stable player-conditional champion rankings under sparse and non-stationary data, and that each ranking decomposes into interpretable parts for expected performance, fit, mastery, and archetype compatibility. The method combines robust median/MAD normalization, logarithmic transforms on event counts, recency-weighted style vectors, mastery-weighted champion-pool vectors, weighted cosine similarity, rank-scaled components, and k-means++ archetype clustering.
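
As an illustration of the preprocessing named in the core claim, here is a minimal Python sketch of median/MAD normalization with an optional log(1 + x) transform. The function name `robust_z`, the clipping bound, and the small `eps` guard are assumptions made for this example, not values taken from the paper. The constant 1.4826 is the usual factor that makes the MAD comparable to a standard deviation for normally distributed data.

```python
import math
import statistics

def robust_z(values, log_counts=False, clip=5.0, eps=1e-9):
    """Median/MAD z-scores (hypothetical helper, not the paper's code).

    log_counts: apply log(1 + x) first, for skewed event counts.
    clip: assumed symmetric bound on the resulting z-scores.
    eps: guards against division by zero when the MAD is 0.
    """
    xs = [math.log1p(v) for v in values] if log_counts else list(values)
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    scale = 1.4826 * mad + eps
    return [max(-clip, min(clip, (x - med) / scale)) for x in xs]
```

For the input [1, 2, 3, 4, 100], the median (3) and MAD (1) are the same as for [1, 2, 3, 4, 5], so the single extreme value does not distort the scaling of the other observations; it is simply clipped.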

What carries the argument

A decomposed recommendation score that sums a population-strength proxy, player-style similarity (via weighted cosine), direct and indirect mastery priors, and archetype-level guardrails from clustering.
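
One way such a decomposed score could be assembled is sketched below, under stated assumptions. The helper names `rank_scale` and `decomposed_scores`, the component keys, and the weights are all hypothetical; this summary does not state the paper's actual weights or combination rule. The sketch rank-scales each component to [0, 1] and keeps every weighted part next to the total so the result stays inspectable.

```python
def rank_scale(scores):
    """Map raw scores to [0, 1] by rank; tied values share the average rank."""
    names = sorted(scores, key=scores.get)
    n = len(names)
    if n == 1:
        return {names[0]: 1.0}
    scaled = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[names[j + 1]] == scores[names[i]]:
            j += 1
        avg_rank = (i + j) / 2
        for k in range(i, j + 1):
            scaled[names[k]] = avg_rank / (n - 1)
        i = j + 1
    return scaled

def decomposed_scores(components, weights):
    """components: component name -> {champion: raw score}.

    Assumes every component covers the same set of champions.
    Returns champion -> {"total": ..., <component>: weighted part, ...}.
    """
    scaled = {name: rank_scale(vals) for name, vals in components.items()}
    champions = list(next(iter(components.values())))
    result = {}
    for champ in champions:
        parts = {name: weights[name] * scaled[name][champ] for name in components}
        result[champ] = {"total": sum(parts.values()), **parts}
    return result
```

Because each champion's entry carries its per-component parts, a ranking can be explained by reading off which component contributed most to the total.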

If this is right

  • Recommendation scores become inspectable because each champion's total breaks down into separate contributions from strength, style fit, mastery, and archetype match.
  • The system can adapt to changes in how a player behaves over time through recency weighting and robust statistical transforms without retraining a full model.
  • New data sources or constraints can be added or removed in a modular way while keeping the output auditable and reproducible.
  • Evaluation can proceed through temporal train-test splits that test next-champion recovery and calibration without needing external win labels.
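
The recency weighting and weighted cosine similarity mentioned above can be sketched as follows. The exponential half-life form of the decay, the default `half_life` value, and all function names are assumptions for illustration; this summary does not give the paper's decay form or its feature weights.

```python
import math

def recency_weights(ages_in_games, half_life=20.0):
    """Normalized exponential-decay weights (assumed form).

    A game that is `half_life` games old counts half as much as the newest.
    """
    raw = [0.5 ** (age / half_life) for age in ages_in_games]
    total = sum(raw)
    return [r / total for r in raw]

def style_vector(game_features, half_life=20.0):
    """Recency-weighted mean of per-game feature vectors.

    game_features[0] is the most recent game.
    """
    weights = recency_weights(range(len(game_features)), half_life)
    dim = len(game_features[0])
    return [sum(w * g[d] for w, g in zip(weights, game_features))
            for d in range(dim)]

def weighted_cosine(u, v, feature_weights):
    """Cosine similarity under non-negative per-feature weights."""
    num = sum(w * a * b for w, a, b in zip(feature_weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(feature_weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(feature_weights, v)))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no signal in one of the vectors
    return num / (norm_u * norm_v)
```

Changing `half_life` shifts how quickly older games fade without touching any other component, which is the kind of adjustment the modularity claim relies on.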

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same four-source structure could be tested on recommendation tasks in other games that also have sparse per-player histories.
  • Stable archetypes from the clustering step might later serve as a way to group players for broader game-balance analysis.
  • The method could be extended to suggest full team lineups by applying the same compatibility logic across multiple roles at once.

Load-bearing premise

The assumption that standard robust normalization, logarithmic transforms, recency weighting, cosine similarity, and clustering steps can be combined into stable rankings without later manual adjustments that would hurt the claimed interpretability.

What would settle it

Apply the ranking method to a large collection of players using data up to a fixed date, then measure whether the top-ranked champions appear more often as the next champion actually played in later games than simple baselines such as global popularity or random selection.
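
A minimal sketch of that test, assuming each player's history is a list of (timestamp, champion) pairs: split at a fixed cutoff, rank using only the earlier games, and score the first later game with Hit@K and reciprocal rank. The function names, the data layout, and the `most_played` baseline are illustrative assumptions, not the paper's evaluation code.

```python
from collections import Counter

def temporal_split(history, cutoff):
    """Split (timestamp, champion) pairs into games before / from the cutoff."""
    ordered = sorted(history, key=lambda entry: entry[0])
    train = [champ for ts, champ in ordered if ts < cutoff]
    test = [champ for ts, champ in ordered if ts >= cutoff]
    return train, test

def hit_at_k(ranking, target, k):
    return 1.0 if target in ranking[:k] else 0.0

def reciprocal_rank(ranking, target):
    return 1.0 / (ranking.index(target) + 1) if target in ranking else 0.0

def most_played(train):
    """Baseline: rank champions by how often the player picked them."""
    return [champ for champ, _ in Counter(train).most_common()]

def evaluate(rankers, players, cutoff, k=3):
    """rankers: name -> function(train champions) -> ranked champion list.

    Players with no games on either side of the cutoff are skipped.
    """
    results = {}
    for name, rank_fn in rankers.items():
        hits, rrs = [], []
        for history in players:
            train, test = temporal_split(history, cutoff)
            if not train or not test:
                continue
            ranking = rank_fn(train)
            hits.append(hit_at_k(ranking, test[0], k))
            rrs.append(reciprocal_rank(ranking, test[0]))
        n = len(hits)
        results[name] = {
            "hit_at_k": sum(hits) / n if n else None,
            "mrr": sum(rrs) / n if n else None,
            "n": n,
        }
    return results
```

Any candidate recommender can be passed in `rankers` alongside baselines such as `most_played` or a fixed global-popularity list, so all methods are scored on the same held-out games.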

Figures

Figures reproduced from arXiv: 2605.18338 by Min Heo, Pranav Kadiyam, Prasun Panthi.

Figure 1. Prototype frontend interface. The interface exposes player lookup controls, a scrollable top- […]
Figure 2. Qualitative OP.GG match-history context for the case study. The image has been cropped to reduce exposure of other […]
Figure 3. Qualitative OP.GG champion mastery context. Veigar and Xerath mastery help explain why control-mage recommenda […]

Original abstract

Champion recommendation in multiplayer online battle arena games is usually framed informally as a problem of metagame strength, personal comfort, or global win rate. We formalize champion recommendation in League of Legends as an interpretable, player-conditional ranking problem under sparse, noisy, and non-stationary behavioral data. The proposed framework combines four information sources: a population-strength proxy, player-style similarity, direct and indirect mastery priors, and archetype-level guardrails. The method uses robust median/MAD normalization, logarithmic transforms for skewed event counts, recency-weighted player style vectors, mastery-weighted champion-pool vectors, weighted cosine similarity, rank-scaled score components, and k-means++ clustering for coarse archetype support. The implemented prototype uses a Python/Pandas modeling layer, Supabase-backed storage, and a web-facing recommendation interface. Unlike black-box supervised win-prediction systems, the proposed method returns decomposed recommendation scores that can be inspected as expected-performance proxy, fit, mastery, and archetype compatibility. A single-player case study on a 100-game history for the player identifier DIVINERAINRACCON is included as an end-to-end sanity check. The manuscript is therefore a methods and systems contribution: it specifies a reproducible, modular, and auditable champion recommender and gives a validation protocol for future large-scale evaluation through temporal train-test splits, next-champion recovery, calibration analysis, and ablation studies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper formalizes champion recommendation in League of Legends as an interpretable player-conditional ranking problem under sparse, noisy, and non-stationary behavioral data. It proposes a framework that integrates four information sources—a population-strength proxy, player-style similarity, direct and indirect mastery priors, and archetype-level guardrails—via robust median/MAD normalization, logarithmic transforms, recency-weighted style vectors, mastery-weighted champion-pool vectors, weighted cosine similarity, rank-scaled scores, and k-means++ clustering. The manuscript describes a Python/Pandas prototype with Supabase storage and a web interface, presents a single-player sanity check on 100 games for identifier DIVINERAINRACCON, and positions the work as a reproducible methods contribution while deferring quantitative validation (temporal splits, next-champion recovery, calibration, ablations) to future work.

Significance. If the framework demonstrates robustness at scale, it offers a modular, auditable alternative to black-box win-prediction systems by returning inspectable decomposed scores for expected-performance proxy, fit, mastery, and archetype compatibility. The explicit reproducibility focus, use of standard robust statistics and clustering tools, and outlined validation protocol are positive features for applied statistics work in gaming analytics.

major comments (2)
  1. [Abstract and Case Study] The central robustness claim under sparse, noisy, non-stationary data is supported only by the single-player end-to-end sanity check on DIVINERAINRACCON's 100-game history. No results are reported from the proposed validation protocol (temporal train-test splits, next-champion recovery, calibration analysis, or ablations), leaving the interaction of the four information sources and stability of archetype clustering untested at scale.
  2. [Methods] The method description lists free parameters, including the component weights in the rank-scaled scores and the recency decay factor, yet provides no sensitivity analysis or default-setting protocol. This risks undermining the claimed interpretability and robustness when the combination of recency-weighted vectors, mastery-weighted pools, weighted cosine similarity, and k-means++ is applied to new players or periods.
minor comments (2)
  1. [Implementation] The implementation section would benefit from an example of the decomposed recommendation output (e.g., a table showing the four score components for a sample champion) to illustrate how the scores are presented to users.
  2. [Archetype Clustering] Clarify whether the archetype cluster count is fixed or chosen via a data-driven criterion, and report any stability checks across temporal subsets of the data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments highlight important aspects of scope and reproducibility that we address below. We plan targeted revisions to clarify the manuscript's positioning as a methods contribution and to strengthen guidance on implementation parameters, while preserving the deferred large-scale validation as future work.

Point-by-point responses
  1. Referee: [Abstract and Case Study] The central robustness claim under sparse, noisy, non-stationary data is supported only by the single-player end-to-end sanity check on DIVINERAINRACCON's 100-game history. No results are reported from the proposed validation protocol (temporal train-test splits, next-champion recovery, calibration analysis, or ablations), leaving the interaction of the four information sources and stability of archetype clustering untested at scale.

    Authors: We agree that the current empirical support is limited to the single-player sanity check. The manuscript is framed as a methods and systems contribution that specifies a modular framework and outlines a validation protocol for future evaluation; it does not claim to have demonstrated robustness at scale. The case study illustrates end-to-end functionality rather than serving as comprehensive evidence. In revision we will update the abstract and introduction to more explicitly separate the design choices intended to promote robustness (robust normalization, recency weighting, archetype guardrails) from the limited empirical demonstration provided. We will also expand the validation-protocol section to include concrete milestones for the deferred analyses. revision: partial

  2. Referee: [Methods] The method description lists free parameters, including the component weights in the rank-scaled scores and the recency decay factor, yet provides no sensitivity analysis or default-setting protocol. This risks undermining the claimed interpretability and robustness when the combination of recency-weighted vectors, mastery-weighted pools, weighted cosine similarity, and k-means++ is applied to new players or periods.

    Authors: We acknowledge that explicit defaults and rationale for the free parameters would improve reproducibility. The current text describes the functional forms but leaves specific numerical choices implicit in the case study. In the revised manuscript we will add a short subsection on parameter selection that states the defaults used for the DIVINERAINRACCON example (component weights in the rank-scaled score and the recency decay factor), explains their motivation from the data characteristics, and notes how they can be adjusted. A full sensitivity study remains part of the future quantitative validation protocol, but the added guidance will allow readers to replicate the reported pipeline immediately. revision: yes

Circularity Check

0 steps flagged

No circularity: framework composes standard techniques on external data

The manuscript presents a modular methods contribution that combines population proxies, style similarity via weighted cosine, mastery priors, and archetype clustering using median/MAD normalization, log transforms, recency weighting, and k-means++. These are established, independent statistical primitives applied to observed behavioral data rather than derived quantities. No equations reduce a claimed prediction or result to a fitted parameter or self-citation by construction. The single-player case study is explicitly labeled a sanity check with quantitative validation deferred, so no load-bearing claim collapses into its own inputs. The derivation chain remains self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on several modeling choices for handling game telemetry that are not independently verified in the abstract.

free parameters (2)
  • component weights in rank-scaled scores
    The abstract mentions rank-scaled score components whose relative importance must be chosen or tuned.
  • recency decay factor
    Recency-weighted player style vectors require a decay parameter whose value is not specified.
axioms (2)
  • domain assumption: Behavioral data in League of Legends can be usefully summarized by recency-weighted style vectors and mastery-weighted champion-pool vectors
    Invoked when constructing the similarity and mastery components.
  • domain assumption: k-means++ clusters on style vectors provide meaningful archetype guardrails
    Used to constrain recommendations at the archetype level.
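
For readers unfamiliar with the clustering step behind the second axiom, the following is a minimal pure-Python sketch of k-means++ seeding followed by standard Lloyd iterations. The function names, the iteration cap, and the seed handling are assumptions for illustration. The paper's feature set and cluster count are not given in this summary, and a real pipeline would more likely call an existing library implementation.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """k-means++ seeding: pick each new center with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with an existing center
            centers.append(list(rng.choice(points)))
            continue
        threshold = rng.random() * total
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc > threshold:
                centers.append(list(p))
                break
        else:  # floating-point fallback; rarely reached
            centers.append(list(points[-1]))
    return centers

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm with k-means++ seeding; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

Re-running `kmeans` with different `seed` values, or on different temporal subsets of the data, and comparing the resulting labels is one simple way to probe the cluster stability that this axiom takes for granted.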

pith-pipeline@v0.9.0 · 5795 in / 1483 out tokens · 45155 ms · 2026-05-19T23:54:24.812425+00:00 · methodology


