Craftax: A lightning-fast benchmark for open-ended reinforcement learning. arXiv preprint arXiv:2402.16801

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.LG 3

years

2026 3

representative citing papers

PACE: Parameter Change for Unsupervised Environment Design

cs.LG · 2026-05-02 · unverdicted · novelty 7.0

PACE uses the squared L2 norm of policy parameter changes from a first-order approximation as an efficient proxy for environment value in UED, outperforming baselines with higher IQM and lower optimality gap on MiniGrid and Craftax OOD tests.

Automatic Generation of High-Performance RL Environments

cs.LG · 2026-03-12 · conditional · novelty 7.0

Closed-loop prompt-based translation with hierarchical verification and iterative repair produces equivalent high-performance RL environments across five cases, including the new TCGJax.

citing papers explorer

Showing 3 of 3 citing papers.

  • PACE: Parameter Change for Unsupervised Environment Design cs.LG · 2026-05-02 · unverdicted · none · ref 5

    PACE uses the squared L2 norm of policy parameter changes from a first-order approximation as an efficient proxy for environment value in UED, outperforming baselines with higher IQM and lower optimality gap on MiniGrid and Craftax OOD tests.

  • Automatic Generation of High-Performance RL Environments cs.LG · 2026-03-12 · conditional · none · ref 14

    Closed-loop prompt-based translation with hierarchical verification and iterative repair produces equivalent high-performance RL environments across five cases, including the new TCGJax.

  • stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation cs.LG · 2026-05-20 · unverdicted · none · ref 43

    The paper presents stable-worldmodel (swm), a platform with a high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.