When to Switch, Not Just What: Transition Quality Prediction in Clash Royale

Heeyun Heo; Huy Kang Kim

arxiv: 2605.21868 · v1 · pith:KDQTTPLInew · submitted 2026-05-21 · 💻 cs.LG

Pith reviewed 2026-05-22 07:57 UTC · model grok-4.3

classification 💻 cs.LG

keywords strategy recommendation · Clash Royale · transition quality · switching behavior · win rate · player personas · game recommendation systems

The pith

Recommending when to switch strategies, not just which ones, improves outcomes in competitive games like Clash Royale.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Analysis of 926,334 matches from 34,619 players shows that players who switch strategies more often tend to have lower win rates. Prior recommendation approaches assume that suggesting a better strategy is always worthwhile, ignoring both the effort and risk of changing mid-stream and the fact that some players do better by sticking to one approach. The proposed TQP system adds gates that decide which players should consider switching at all and at what moments a change would likely pay off, using matched historical cases to estimate the gain over staying. This matters because it targets help toward the players currently suffering most from poor switching habits. The result is a SwitchGap of +10.4 percentage points at a 5.4% recommendation rate, under a new evaluation metric that rewards only recommendations that actually distinguish good switches from bad ones.

Core claim

By treating strategy changes as transitions whose quality can be predicted separately from the base quality of the target strategy, and by conditioning on player subtype and current state, the model identifies switches that deliver net benefits beyond what would occur from staying or from random recovery after losses.

What carries the argument

TQP pipeline structured as Who (PersonaGate), When (TimingGate with matched baseline), What (ScoreFusion combining adoptability and predicted delta win rate), together with the SwitchGap metric for evaluation.
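The Who -> When -> What ordering can be read as a chain of gates, each of which may withhold a recommendation. The sketch below is only an illustration of that control flow, not the authors' implementation: the field names, the zero thresholds, and the weighted-sum fusion rule with weight `alpha` are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    strategy: str
    adoptability: float        # hypothetical score, assumed to lie in [0, 1]
    predicted_delta_wr: float  # predicted win-rate change relative to staying


def persona_gate(consistency_benefit: float) -> bool:
    """Who: pass only players whose consistency is not associated with better results."""
    return consistency_benefit <= 0.0


def timing_gate(predicted_switch_wr: float, matched_stay_wr: float,
                margin: float = 0.0) -> bool:
    """When: pass only if switching is predicted to beat the matched stay baseline."""
    return predicted_switch_wr - matched_stay_wr > margin


def score_fusion(candidates: List[Candidate], alpha: float = 0.5) -> Optional[Candidate]:
    """What: rank candidates by an assumed weighted sum of the two signals."""
    if not candidates:
        return None
    return max(candidates,
               key=lambda c: alpha * c.adoptability + (1 - alpha) * c.predicted_delta_wr)


def recommend(consistency_benefit: float, predicted_switch_wr: float,
              matched_stay_wr: float, candidates: List[Candidate]) -> Optional[Candidate]:
    """Run the gates in order; None means no recommendation is issued."""
    if not persona_gate(consistency_benefit):
        return None
    if not timing_gate(predicted_switch_wr, matched_stay_wr):
        return None
    return score_fusion(candidates)
```

Returning `None` at either gate is what makes a low recommendation rate possible: most player-moments never reach the ranking step.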

If this is right

Recommendations are withheld from players whose data shows consistency leads to better results.
Switching is recommended only when the predicted transition quality exceeds the matched baseline for staying.
The lowest-performing switchers see the largest gains from this conditional advice.
SwitchGap provides a way to score policies even when player behavior is not optimal.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These ideas on transition costs could transfer to non-game settings like career changes or product switches where timing affects net value.
Developers might incorporate similar filters into in-game coaching tools to avoid over-advising changes.
Further work could test if providing the timing signal changes actual player switching rates in real time.

Load-bearing premise

The subtype- and state-matched baseline accurately measures the win rate a player would have achieved by not switching, without mixing in other effects.
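To make the premise concrete, here is a minimal sketch of one way a matched baseline could be computed: take the win rate of non-switch matches within each (subtype, state) cell, then compare switch outcomes against that cell's rate. The record layout (`subtype`, `state`, `switched`, `won`) is hypothetical, and the paper's actual matching variables are not given in the abstract.

```python
from collections import defaultdict


def matched_stay_baseline(records):
    """Win rate of non-switch matches in each (subtype, state) cell."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for r in records:
        if not r["switched"]:
            key = (r["subtype"], r["state"])
            games[key] += 1
            wins[key] += r["won"]
    return {key: wins[key] / games[key] for key in games}


def switch_delta(records):
    """Mean of (outcome - matched stay win rate) over switch matches, per cell."""
    baseline = matched_stay_baseline(records)
    total = defaultdict(float)
    count = defaultdict(int)
    for r in records:
        key = (r["subtype"], r["state"])
        # Switch matches without a matching stay cell are skipped, not imputed.
        if r["switched"] and key in baseline:
            total[key] += r["won"] - baseline[key]
            count[key] += 1
    return {key: total[key] / count[key] for key in count}
```

This version controls only for cell membership. Anything that differs between switchers and stayers inside a cell still flows into the delta, which is the residual-confounding risk the premise depends on.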

What would settle it

A randomized trial assigning players to receive switch recommendations at predicted good times versus at arbitrary times, then comparing subsequent win rates between the groups.
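The comparison at the end of such a trial could be as simple as a two-proportion z-test on win counts. The sketch below assumes independent matches and hypothetical arm names ("predicted good times" as arm A, "arbitrary times" as arm B). Real match data are clustered by player, so a production analysis would need to account for that.

```python
import math


def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Return (difference in win rate, z statistic, two-sided p-value)."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value
```

For example, `two_proportion_z(550, 1000, 500, 1000)` compares a 55% win rate against a 50% win rate with 1,000 matches in each arm.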

Figures

Figures reproduced from arXiv: 2605.21868 by Heeyun Heo, Huy Kang Kim.

**Figure 1.** Win rate by user subtype and strategy state. State 12 exhibits abnormally low win rates across all subtypes, [PITH_FULL_IMAGE:figures/full_fig_p004_1.png]

**Figure 2.** Switch rate after loss vs. after win by subtype. Subtype 1 exhibits post-loss transition rates that are [PITH_FULL_IMAGE:figures/full_fig_p005_2.png]

Original abstract

In competitive games, players frequently switch strategies after losing streaks, yet our analysis of 926,334 match records from 34,619 Clash Royale players reveals a counterintuitive pattern: switching frequency is inversely associated with the win rate, with effects that vary substantially across players and situational contexts. We attribute this to a limitation common in many prior recommendation systems, which evaluate strategies by expected quality while overlooking the behavioral cost of switching and individual differences in switching propensity. We refer to this implicit premise as the Zero Switching Cost Assumption. To address this, we reformulate strategy recommendation as a transition-level decision problem and instantiate it as TQP (Transition Quality Predictor), a three-stage pipeline structured as Who -> When -> What. PersonaGate suppresses recommendations for players whose strategic consistency is empirically associated with superior outcomes. TimingGate identifies moments when switching is likely to yield a net benefit over staying, using a subtype- and state-matched baseline to control for natural win-rate recovery. ScoreFusion ranks candidate strategies by combining an adoptability signal with predicted transition quality (delta WR). We further introduce SwitchGap, an evaluation metric that measures a policy's discriminative quality without treating observed player choices as optimal ground truth. This property is particularly important because the most frequent switchers record the lowest win rates. The full pipeline achieves a SwitchGap of +10.4 percentage points at a recommendation rate of 5.4%, and loss-triggered switchers, despite being the lowest-performing group, benefit the most from subtype-conditioned guidance.
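The abstract does not give a formula for SwitchGap. One plausible reading, used here purely as an assumption, is the win-rate difference in percentage points between observed switches the policy would have endorsed and those it would not, reported alongside the recommendation rate. The function name and input layout below are hypothetical.

```python
def switch_gap(transitions):
    """Assumed SwitchGap: win-rate gap (pp) between endorsed and non-endorsed switches.

    transitions: iterable of (recommended, won) pairs, one per observed switch.
    Returns (gap in percentage points, recommendation rate).
    """
    transitions = list(transitions)
    rec = [won for recommended, won in transitions if recommended]
    non = [won for recommended, won in transitions if not recommended]
    if not rec or not non:
        raise ValueError("need both recommended and non-recommended transitions")
    gap_pp = 100.0 * (sum(rec) / len(rec) - sum(non) / len(non))
    rate = len(rec) / len(transitions)
    return gap_pp, rate
```

Under this reading the metric never asks whether the player's actual choice was correct. It only asks whether the policy separates better-performing switches from worse ones, which matches the abstract's stated motivation.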

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reframes game strategy recommendation as a timing problem, with a matched-baseline pipeline and a SwitchGap metric that avoids treating observed switches as optimal, but the reported gains rest on whether the observational matching really isolates switch benefits.

The main point is that this work moves past just ranking the best strategy and instead builds a Who-When-What pipeline to decide when a switch is worth the cost in Clash Royale. They back it with 926k matches from 34k players and show that frequent switchers have the lowest win rates, which makes sense for why a zero-cost assumption breaks down in practice. The SwitchGap metric is a clean addition because it scores policies on how well they separate good transitions without using player choices as ground truth. That lets them report a +10.4 pp lift at a 5.4% recommendation rate and note that the weakest players gain most from the subtype guidance. The large scale and the concrete numbers give the empirical side some credibility. The framing also directly tackles individual differences in switching propensity, which standard expected-quality recommenders skip.

On the downside, the TimingGate result depends on a subtype- and state-matched baseline to separate switch gains from ordinary recovery. In observational match data, unmeasured factors such as tilt, deck fatigue, or finer card-cycle state could still correlate with both the switch decision and the outcome, pushing the delta-WR upward. The abstract claims the baseline controls for natural recovery, but without the exact matching variables or robustness checks it is hard to judge how much residual bias remains, especially for the loss-triggered group. Training and validation details are also missing from the summary, so the performance numbers could partly reflect in-sample fitting.

This is aimed at researchers working on sequential recommendations or behavioral analytics in competitive games. A reader who cares about cost-aware decision policies would find the pipeline structure and the evaluation metric worth looking at. It has enough novelty and data grounding to merit peer review, though referees will need to see the full methods and any sensitivity tests on the matching step.

Referee Report

3 major / 2 minor

Summary. The manuscript analyzes 926,334 match records from 34,619 Clash Royale players and reports that switching frequency is inversely associated with win rate. It attributes this pattern to the Zero Switching Cost Assumption in prior recommendation systems and proposes the TQP three-stage pipeline (PersonaGate, TimingGate, ScoreFusion) that reformulates strategy recommendation as a transition-level decision. TimingGate employs a subtype- and state-matched baseline to identify beneficial switch moments, while the full pipeline is evaluated with the new SwitchGap metric, yielding +10.4 percentage points at a 5.4% recommendation rate; loss-triggered switchers are reported to benefit most from the subtype-conditioned guidance.

Significance. If the results hold after addressing methodological details, the work is significant for game AI and recommender systems by explicitly modeling switching costs and individual differences rather than assuming observed choices are optimal. The SwitchGap metric is a useful contribution because it evaluates discriminative quality without treating frequent switchers (who have the lowest win rates) as ground truth. The large-scale observational dataset and concrete performance numbers strengthen the empirical case, and the finding that the lowest-performing subgroup benefits most is a falsifiable prediction worth testing in follow-up work.

major comments (3)

[Abstract / TimingGate] Abstract and TimingGate description: the claim that the subtype- and state-matched baseline 'controls for natural win-rate recovery' is load-bearing for the +10.4 pp SwitchGap and the differential benefit for loss-triggered switchers, yet no details are given on the exact matching variables, balance diagnostics, or sensitivity checks for residual confounders such as player-specific momentum, exact card cycle state, or psychological tilt; any imbalance would upward-bias the delta-WR estimate for switchers.
[Methods / Evaluation] Pipeline and evaluation sections: no information is supplied on model training details, validation procedures, hyperparameter selection, or data splits for the fitted models underlying TQP and SwitchGap; without these, it is impossible to determine whether the reported gains reflect genuine out-of-sample predictive power or in-sample fitting, directly affecting the circularity of the performance claims.
[Results] Results on subgroup benefits: the statement that loss-triggered switchers (lowest-performing group) benefit most from subtype-conditioned guidance rests on the TimingGate attribution, but lacks explicit reporting of subgroup sizes, confidence intervals, or alternative explanations such as regression to the mean or selection effects in who experiences loss streaks.

minor comments (2)

[Abstract] The abstract introduces SwitchGap without a one-sentence definition; adding a brief parenthetical description would improve accessibility for readers outside the subfield.
[Pipeline overview] Notation for delta WR and recommendation rate should be defined consistently on first use to avoid ambiguity in the pipeline description.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments identify important areas where additional methodological transparency will strengthen the paper. We respond to each major comment below and indicate the revisions planned for the next version.

Referee: [Abstract / TimingGate] Abstract and TimingGate description: the claim that the subtype- and state-matched baseline 'controls for natural win-rate recovery' is load-bearing for the +10.4 pp SwitchGap and the differential benefit for loss-triggered switchers, yet no details are given on the exact matching variables, balance diagnostics, or sensitivity checks for residual confounders such as player-specific momentum, exact card cycle state, or psychological tilt; any imbalance would upward-bias the delta-WR estimate for switchers.

Authors: We agree that the current description is insufficiently detailed. In the revised manuscript we will expand the TimingGate subsection to list the precise matching variables (player subtype derived from historical playstyle clustering, current card-cycle position, elixir differential, and streak length). We will also report balance diagnostics (standardized mean differences pre- and post-matching) and include a sensitivity analysis that varies the matching tolerance and adds proxies for momentum and tilt. These additions will allow readers to assess residual confounding directly.

revision: yes
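The balance diagnostic named in this response has a standard form. Below is a minimal sketch of the standardized mean difference for one covariate, using the pooled sample standard deviation. The covariate and group names are illustrative, and the zero-variance handling is a choice made for this sketch rather than anything the rebuttal specifies.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    diff = mean(treated) - mean(control)
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced if equal, maximally imbalanced otherwise.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd
```

A common rule of thumb treats an absolute SMD below about 0.1 after matching as acceptable balance. Reporting it per matching variable, before and after matching, is what would let a reader check covariates such as streak length directly.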
Referee: [Methods / Evaluation] Pipeline and evaluation sections: no information is supplied on model training details, validation procedures, hyperparameter selection, or data splits for the fitted models underlying TQP and SwitchGap; without these, it is impossible to determine whether the reported gains reflect genuine out-of-sample predictive power or in-sample fitting, directly affecting the circularity of the performance claims.

Authors: We acknowledge the omission. The revised Methods section will contain a new experimental-setup subsection that specifies the player-level 80/20 train-test split (stratified by player ID to avoid leakage), 5-fold cross-validation for hyperparameter tuning, the grid-search ranges used, and the exact algorithms (gradient-boosted trees for transition-quality prediction and logistic regression for PersonaGate). We will also state that SwitchGap is computed exclusively on the held-out test set.

revision: yes
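A player-level split means every match from a given player lands on the same side of the split, so a model cannot be tested on players it has already seen. The sketch below shows that idea with the standard library only. The `player_id` field name, the seed, and the rounding rule are assumptions, and the cross-validation and grid search mentioned in the response are not shown.

```python
import random


def player_level_split(records, test_fraction=0.2, seed=0):
    """Split match records so that no player appears in both train and test."""
    players = sorted({r["player_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(players)
    n_test = max(1, round(len(players) * test_fraction))
    test_players = set(players[:n_test])
    train = [r for r in records if r["player_id"] not in test_players]
    test = [r for r in records if r["player_id"] in test_players]
    return train, test
```

Because the 80/20 ratio applies to players rather than matches, the match-level proportions can drift from 80/20 when activity is uneven across players.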
Referee: [Results] Results on subgroup benefits: the statement that loss-triggered switchers (lowest-performing group) benefit most from subtype-conditioned guidance rests on the TimingGate attribution, but lacks explicit reporting of subgroup sizes, confidence intervals, or alternative explanations such as regression to the mean or selection effects in who experiences loss streaks.

Authors: We will add the requested statistics: subgroup sizes (number of players and matches per category), bootstrap confidence intervals for the reported deltas, and a short discussion of regression-to-the-mean and selection effects. While the subtype-matched baseline already mitigates some of these concerns, we will present the alternative explanations transparently so readers can judge their plausibility.

revision: yes
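A percentile bootstrap is one common way to attach a confidence interval to a win-rate delta. The sketch below resamples matches independently within each group, which is a simplifying assumption. Since matches from the same player are correlated, resampling whole players would likely be more appropriate for this dataset. The function and argument names are hypothetical.

```python
import random


def bootstrap_delta_ci(switch_outcomes, stay_outcomes, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for (switch win rate - stay win rate).

    Outcomes are sequences of 0/1 match results.
    """
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_boot):
        # Resample each group with replacement at its original size.
        s = rng.choices(switch_outcomes, k=len(switch_outcomes))
        t = rng.choices(stay_outcomes, k=len(stay_outcomes))
        deltas.append(sum(s) / len(s) - sum(t) / len(t))
    deltas.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return deltas[lo_idx], deltas[hi_idx]
```

With small subgroups the interval widens quickly, which is why reporting subgroup sizes alongside the intervals matters for the loss-triggered group.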

Circularity Check

0 steps flagged

No significant circularity; derivation chain is self-contained

The abstract and provided text describe an empirical pipeline (PersonaGate, TimingGate with subtype/state-matched baseline, ScoreFusion) evaluated via the independently defined SwitchGap metric on 926,334 observational match records. No equations, self-citations, uniqueness theorems, or ansatzes are shown that would reduce any claimed prediction or result to its own inputs by construction. The reported +10.4 pp SwitchGap and differential benefits are presented as outcomes of the pipeline rather than tautological redefinitions or in-sample fits renamed as forecasts. The derivation therefore remains non-circular and externally falsifiable against the held-out match data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides insufficient technical detail to identify specific free parameters, axioms, or invented entities.

