GUI Agents for Continual Game Generation
4 Pith papers cite this work.
abstract
Generating a game is not the same as making one that can be played. Despite advances in code generation, existing approaches treat game generation as one-shot translation from prompt to artifact, leaving interaction-level failures undetected. We argue that evaluating and improving game generation requires a player, and study two roles for graphical user interface (GUI) agents in this process: (1) as an objective evaluator, for which we introduce PlaytestArena, a new evaluation environment that pairs 200 browser-based game generation tasks across eight genres with rubrics of expected in-play behaviors, adjudicated by a GUI agent that loads each build in a browser and plays it; and (2) as a subjective playtester, for which we propose Play2Code, where a game agent and a GUI agent operate in a sustained loop with shared memory, turning game generation into a dialogue between coding and playing. Our experiments show that even frontier models struggle to generate playable games directly, while Play2Code achieves a 66.8% rubric pass-rate, improving over single-pass and agentic-coding baselines by 37.1 and 14.6 points, respectively. Further analysis shows that GUI playtester feedback is more traceable than a human report, yet idiosyncratic in ways reminiscent of human testers, establishing game playtesting as a critical testbed for interactive code generation. Our project website is available at https://continual-game-generation.vercel.app/.
fields
cs.CL (4)
years
2026 (4)
verdicts
UNVERDICTED (4)
representative citing papers
A3M integrates adaptive DRL, adversarial opponent modeling, and multi-objective rewards to cut regret 30-40% versus baselines while remaining robust to strategy shifts in repeated auctions.
EVLA combines a Unified Co-State Encoder and Electro-aware Structured Reasoning Chain with physics-guided training to produce energy-optimal driving decisions, reporting +5.6% accuracy gains over fine-tuned VLM baselines on a driving QA benchmark.
FinInvest-GTCN combines graph, temporal, and causal networks with meta-causal adaptation to improve risk-adjusted predictions for VC investments, achieving RA-MSE of 2.51 and 18.7% higher simulated returns on proprietary data.
citing papers explorer
- DAIN: Dynamic Agent-Based Interaction Network for Efficient and Collaborative Multimodal Reasoning
  DAIN reframes multimodal fusion as dynamic agent collaboration with sparse activation, claiming SOTA results including 2.6% accuracy gain on ADNI across five benchmarks.
- A3M: Adaptive, Adversarial and Multi-Objective Learning for Strategic Bidding in Repeated Auctions
  A3M integrates adaptive DRL, adversarial opponent modeling, and multi-objective rewards to cut regret 30-40% versus baselines while remaining robust to strategy shifts in repeated auctions.
- EVLA: An Electro-Aware Multimodal Assistant for Physically-Grounded Driving Reasoning and Control
  EVLA combines a Unified Co-State Encoder and Electro-aware Structured Reasoning Chain with physics-guided training to produce energy-optimal driving decisions, reporting +5.6% accuracy gains over fine-tuned VLM baselines on a driving QA benchmark.
- FinInvest-GTCN: Explainable Graph-Temporal-Causal Modeling for Risk-Aware Investment Decision Optimization
  FinInvest-GTCN combines graph, temporal, and causal networks with meta-causal adaptation to improve risk-adjusted predictions for VC investments, achieving RA-MSE of 2.51 and 18.7% higher simulated returns on proprietary data.