From LLM-Driven Trading Card Generation to Procedural Relatedness: A Pokémon Case Study
Pith reviewed 2026-05-07 07:36 UTC · model grok-4.3
The pith
A pipeline of large language models and diffusion models generates personalized Pokémon trading cards that let players realize their own ideas through prompt adjustments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present a pipeline combining player-centric co-creation, fine-tuned embeddings, local LLMs, and diffusion models to generate dynamic, personalized Pokémon cards. Evaluation in a user study with 49 participants who produced 196 samples showed high satisfaction and indicated that most participants successfully realized their own ideas through prompt adjustments. These findings lay groundwork for future content generation systems and alternatives to conventional metagame evolution through procedural relatedness.
What carries the argument
The player-centric co-creation pipeline that processes user prompts via fine-tuned embeddings and local LLMs to direct diffusion models in producing card text and images.
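As a rough illustration of how such a pipeline might be wired together, the Python sketch below retrieves the reference cards most similar to a player's prompt and assembles one prompt for a text model and one for an image model. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for the paper's fine-tuned embeddings, the reference cards and all function names are invented placeholders, and the actual LLM and diffusion calls are omitted because the source does not specify them.

```python
import math
from collections import Counter

# Hypothetical reference cards; placeholders, not data from the paper.
REFERENCE_CARDS = [
    {"name": "Emberfox", "type": "Fire", "text": "fire fox burns the opponent"},
    {"name": "Tidecrab", "type": "Water", "text": "water crab shields with a shell"},
    {"name": "Voltmouse", "type": "Lightning", "text": "electric mouse shocks the opponent"},
]


def embed(text):
    """Toy bag-of-words vector (stand-in for a fine-tuned embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(player_prompt, k=2):
    """Return the k reference cards most similar to the player's prompt."""
    query = embed(player_prompt)
    ranked = sorted(
        REFERENCE_CARDS,
        key=lambda card: cosine(query, embed(card["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompts(player_prompt, k=2):
    """Assemble the text and image prompts that later stages would consume."""
    examples = retrieve(player_prompt, k)
    example_lines = "\n".join(
        f"- {c['name']} ({c['type']}): {c['text']}" for c in examples
    )
    text_prompt = (
        f"Design a new trading card based on this idea: {player_prompt}\n"
        f"Similar existing cards for reference:\n{example_lines}"
    )
    image_prompt = f"trading card illustration of {player_prompt}"
    return {
        "examples": examples,
        "text_prompt": text_prompt,
        "image_prompt": image_prompt,
    }


if __name__ == "__main__":
    prompts = build_prompts("a fire fox that burns")
    print(prompts["text_prompt"])
    print(prompts["image_prompt"])
```

In a sketch like this, a player who is unhappy with the result would edit the prompt and rerun `build_prompts`, which mirrors the prompt-adjustment loop the summary describes.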
If this is right
- TCGs could sustain engagement through a practically unlimited supply of personalized card designs rather than relying on periodic official updates.
- Players could develop unique connections to their cards via procedural relatedness.
- This offers an alternative path for metagame evolution that reduces repetitive strategies.
- Large-scale personalized content creation becomes feasible while expanding the creative options available to players.
Where Pith is reading between the lines
- The reliance on local LLMs suggests the system could run offline, allowing card generation without external data sharing.
- Procedural relatedness might change collecting behaviors, as players value self-generated cards differently from mass-produced ones.
- Similar pipelines could extend to other domains needing custom game pieces, such as custom board game components.
Load-bearing premise
That ratings for visual appeal and self-reported idea realization are enough to conclude the cards would be mechanically balanced and enjoyable when actually played in TCG matches.
What would settle it
A playtest in which participants use generated cards in actual matches and report on mechanical balance, strategic options, and overall enjoyment relative to official cards.
Original abstract
Since the dawn of Trading Card Games, the genre has grown into a multi-billion-dollar industry engaging millions of analog and digital players worldwide. Popular TCGs rely on regular updates, balance adjustments, and rotating constraints to sustain engagement. Yet, as metagames stabilize, predictable strategies dominate and viable card options diminish, often resulting in repetitive and impaired player experiences. This paper investigates the use of Large Language Models and Image Diffusion Models for Procedural Content Generation of TCG cards, addressing these challenges by enabling a personalized infinity of card designs. Modern generative AI not only enables large-scale content creation but could even introduce procedural relatedness, fostering unique connections between players and their cards. We present a pipeline combining player-centric co-creation, fine-tuned embeddings, local LLMs, and Diffusion Models to generate dynamic, personalized cards while potentially expanding creative range. We evaluated the pipeline in a user study with 49 participants who generated 196 Pokémon card samples. Participants rated aesthetics and representativeness of visuals and mechanics, and provided qualitative feedback. Results show high satisfaction and indicate that most participants successfully realized their own ideas through prompt adjustments. These findings lay groundwork for future content generation systems and alternatives to conventional metagame evolution through procedural relatedness.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a pipeline combining player-centric co-creation, fine-tuned embeddings, local LLMs, and diffusion models to generate personalized Pokémon TCG cards. It evaluates the approach through a user study with 49 participants who produced 196 card samples, collecting ratings on the aesthetics and representativeness of visuals and mechanics, along with qualitative feedback on realizing personal ideas via prompt adjustments. Results are reported as showing high satisfaction, with most participants successfully realizing their ideas, and the work positions this as groundwork for procedural relatedness and alternatives to traditional metagame evolution in TCGs.
Significance. If the user-study findings hold, the paper demonstrates the practical feasibility of generative AI for scalable, personalized TCG content creation, with the direct collection of ratings and feedback from 49 participants across 196 samples serving as a concrete empirical contribution. This could support future co-creation systems. However, the broader significance for claims about addressing metagame stagnation and repetitive play is limited, as the evaluation does not include objective mechanical validation. The introduction of procedural relatedness remains conceptual rather than evidenced through unique player-card connections.
major comments (2)
- [Introduction] Introduction (paragraph 2): The text frames metagame stabilization, balance adjustments, and repetitive play as core TCG challenges that the pipeline addresses via personalized cards and procedural relatedness. The evaluation section, however, reports only subjective ratings on aesthetics, representativeness, and self-reported idea realization, with no power-level estimation, synergy detection, simulated gameplay, or post-generation playtests. This leaves the claim of a viable alternative to conventional metagame evolution resting on an untested assumption about downstream mechanical usability and balance.
- [Evaluation] Evaluation section (user study description): The abstract and results state that the study shows 'high satisfaction' and that 'most participants successfully realized their own ideas,' yet no details are provided on rating scales, statistical tests, inter-rater reliability, error bars, confidence intervals, or baseline comparisons. This directly affects the robustness of the central empirical claims about satisfaction and idea realization.
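To make the "power-level estimation" mentioned in the first major comment concrete, here is a minimal Python sketch of one crude heuristic: compare each generated card's damage per energy against a set of reference cards and flag large deviations. This is not a method from the paper. The card fields (`damage`, `energy_cost`), the z-score threshold, and all numbers are hypothetical placeholders, and a real balance check would also need to account for effects, HP, and synergies.

```python
from statistics import mean, stdev


def damage_per_energy(card):
    """Crude power proxy; assumes energy_cost is at least 1."""
    return card["damage"] / card["energy_cost"]


def flag_outliers(generated, reference, z_threshold=2.0):
    """Return (name, z-score) for generated cards far from the reference mean."""
    ref_ratios = [damage_per_energy(card) for card in reference]
    mu = mean(ref_ratios)
    sigma = stdev(ref_ratios)
    if sigma == 0:
        return []  # no spread in the reference set, so nothing to compare against
    flagged = []
    for card in generated:
        z = (damage_per_energy(card) - mu) / sigma
        if abs(z) > z_threshold:
            flagged.append((card["name"], round(z, 2)))
    return flagged


if __name__ == "__main__":
    # Placeholder values for illustration only.
    reference = [
        {"name": "Ref A", "damage": 20, "energy_cost": 1},
        {"name": "Ref B", "damage": 50, "energy_cost": 2},
        {"name": "Ref C", "damage": 90, "energy_cost": 3},
        {"name": "Ref D", "damage": 25, "energy_cost": 1},
    ]
    generated = [
        {"name": "Overtuned", "damage": 200, "energy_cost": 1},
        {"name": "Ordinary", "damage": 50, "energy_cost": 2},
    ]
    print(flag_outliers(generated, reference))
```

A check of this kind only screens for obviously overtuned numbers; it does not replace the simulated gameplay or playtests the comment asks for.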
minor comments (2)
- [Abstract] Abstract: The term 'procedural relatedness' is introduced without a brief definition or cross-reference to its elaboration in the main text, which may leave readers unclear on its precise meaning.
- [Pipeline] Pipeline description: Adding a diagram or pseudocode for the integration of fine-tuned embeddings with the LLM and diffusion stages would improve clarity and support reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment below, agreeing that the introduction requires clarification to better match the scope of the evaluation and that the evaluation section needs expanded statistical reporting for transparency. Revisions will be made accordingly to strengthen the manuscript without overstating the current results.
- Referee: [Introduction] Introduction (paragraph 2): The text frames metagame stabilization, balance adjustments, and repetitive play as core TCG challenges that the pipeline addresses via personalized cards and procedural relatedness. The evaluation section, however, reports only subjective ratings on aesthetics, representativeness, and self-reported idea realization, with no power-level estimation, synergy detection, simulated gameplay, or post-generation playtests. This leaves the claim of a viable alternative to conventional metagame evolution resting on an untested assumption about downstream mechanical usability and balance.
  Authors: We agree that the evaluation is limited to subjective user ratings on aesthetics, representativeness, and self-reported idea realization, without objective mechanical validation such as power-level analysis or gameplay simulations. The manuscript already frames the contribution as laying groundwork for procedural relatedness and alternatives to metagame evolution rather than claiming a complete solution. To ensure consistency and avoid any implication of downstream mechanical validation, we will revise the introduction to explicitly note that the current study focuses on generative feasibility and user satisfaction, while positioning mechanical balance and gameplay testing as important directions for future work. This revision will be incorporated in the next version.
  revision: yes
- Referee: [Evaluation] Evaluation section (user study description): The abstract and results state that the study shows 'high satisfaction' and that 'most participants successfully realized their own ideas,' yet no details are provided on rating scales, statistical tests, inter-rater reliability, error bars, confidence intervals, or baseline comparisons. This directly affects the robustness of the central empirical claims about satisfaction and idea realization.
  Authors: We acknowledge that additional details on the rating methodology and statistical reporting are necessary to support the robustness of the claims. The user study involved 49 participants generating 196 samples with ratings on visual/mechanical aesthetics and representativeness plus qualitative feedback. In the revised manuscript, we will expand the Evaluation section to specify the rating scales (5-point Likert), report means, standard deviations, and any statistical tests performed, include error bars or confidence intervals in figures, and address inter-rater reliability where multiple ratings apply. Baseline comparisons will be discussed if feasible with the collected data. These additions will be made without changing the core findings of high satisfaction and idea realization.
  revision: yes
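The reporting promised here (means, standard deviations, confidence intervals on a 5-point Likert scale) could be computed along the following lines. This is a sketch under stated assumptions, not the authors' analysis: the ratings are invented placeholders, the function names are hypothetical, and the interval uses a normal approximation rather than a t distribution. Because 196 samples from 49 participants averages four cards per participant, the sketch first averages within each participant so that repeated ratings from one person are not treated as independent observations.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def participant_means(ratings_by_participant):
    """Collapse each participant's ratings to one mean."""
    return [mean(ratings) for ratings in ratings_by_participant.values()]


def mean_sd_ci(values, confidence=0.95):
    """Return (mean, sample SD, (low, high)) using a normal-approximation CI."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sd / sqrt(n)
    return m, sd, (m - half_width, m + half_width)


if __name__ == "__main__":
    # Placeholder 5-point Likert ratings, not data from the study.
    ratings = {
        "p01": [4, 5, 4, 5],
        "p02": [3, 4, 4, 3],
        "p03": [5, 5, 4, 4],
    }
    m, sd, (low, high) = mean_sd_ci(participant_means(ratings))
    print(f"mean={m:.2f} sd={sd:.2f} 95% CI=({low:.2f}, {high:.2f})")
```

With a sample of 49 participants, swapping the normal quantile for a t quantile would widen the interval slightly; Likert data may also call for ordinal methods, which this sketch does not cover.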
Circularity Check
No circularity in empirical user study of generative pipeline
The paper presents an LLM- and diffusion-model-based pipeline for generating Pokémon trading cards and evaluates it through a direct user study (49 participants, 196 cards) that collects ratings on aesthetics, representativeness, and self-reported idea realization. No mathematical derivations, equations, fitted parameters, or predictions are described that reduce by construction to the study's own inputs. The concept of 'procedural relatedness' is introduced conceptually as a potential outcome of the pipeline rather than being defined in terms of itself or derived from self-citations. All central claims rest on the independent participant data and qualitative feedback, rendering the work self-contained with no load-bearing self-referential steps.