
arxiv: 2606.22447 · v1 · pith:4NHJZT6Dnew · submitted 2026-06-21 · 💻 cs.AI · cs.LG

A Differentiable Atari VCS: A Complex, Fully Known Ground Truth for Explainable AI

Pith reviewed 2026-06-26 11:06 UTC · model grok-4.3

keywords: differentiable emulator · Atari VCS · explainable AI · ground truth · surrogate gradients · reinforcement learning · VCS hardware

The pith

The Atari 2600 VCS hardware can be reformulated so its execution is fully differentiable while remaining bit-identical to the original at any finite temperature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds two independent differentiable emulators of the Atari VCS, jutari (Julia) and jaxtari (JAX), that reproduce the xitari reference emulator on all 64 supported games with byte-identical RAM states and pixel-identical screens. By recasting cartridge ROM as a weight tensor, RAM as a soft tape, and control flow as gates, it proves that the soft forward pass equals the hard discrete one exactly while supplying surrogate gradients through the originally non-differentiable logic. The result is a system that is both genuinely complex and fully inspectable, which removes the usual barrier to verifying explanations in XAI.

Core claim

Treating the cartridge ROM as a weight tensor, RAM as a soft tape, and control flow as gates, the differentiable (soft) execution equals the original (hard) one bit-for-bit in the forward pass at any finite temperature, while exposing surrogate gradients where the bit logic has none. Both the Julia and JAX ports match the reference emulator on every game; the JAX port additionally opens a GPU path for batched rollouts at millions of environment steps per second.

What carries the argument

The soft-logic reformulation that converts VCS ROM into weights, RAM into a continuous tape, and control flow into gates, thereby preserving exact forward equivalence while adding gradients.
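
One way to picture the "RAM as a soft tape" idea is a read written as a weighted sum over all cells. The sketch below is a toy illustration under stated assumptions, not the paper's code: `one_hot`, `soft_read`, `soft_write`, and the 8-cell tape are hypothetical names and sizes chosen for brevity. It shows only that, when the address weights are one-hot, the weighted-sum read and blended write return exactly what a hard indexed access would.

```python
def one_hot(index, size):
    """Weight vector that puts all mass on one cell (hypothetical helper)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def soft_read(tape, weights):
    """Read a tape as a weighted sum of its cells."""
    return sum(w * cell for w, cell in zip(weights, tape))


def soft_write(tape, weights, value):
    """Blend a new value into every cell in proportion to its weight."""
    return [(1.0 - w) * cell + w * value for w, cell in zip(weights, tape)]


# Toy 8-byte tape for illustration; the real VCS has 128 bytes of RAM.
ram = [0x00, 0x1F, 0x80, 0xFF, 0x2A, 0x07, 0x10, 0x99]
addr = 3

# With one-hot weights the soft operations coincide with the hard ones.
hard_value = ram[addr]
soft_value = soft_read(ram, one_hot(addr, len(ram)))

# A one-hot write replaces the addressed cell and leaves the others unchanged.
new_ram = soft_write(ram, one_hot(addr, len(ram)), 0x42)
```

With weights that are not one-hot, the same functions return blends of several cells, which is where a relaxed read would start to differ from the discrete machine.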

If this is right

  • Gradient-based XAI methods become applicable to a complex, fully known system whose every state can be inspected.
  • Batched differentiable rollouts on a single commodity GPU reach millions of environment steps per second (about 3M at a batch size near 4096 in Figure 10).
  • Explanations produced by any method can be checked directly against the actual cartridge and RAM states.
  • The open-source ports supply a reproducible testbed for any future gradient or explanation technique.
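
The batched-rollout point can be made concrete with a toy version of the all-branch dispatch that the Figure 10 caption mentions. This is a minimal sketch in plain Python, assuming a hypothetical 3-instruction machine (`HANDLERS`, `all_branch_step`, and `batched_step` are illustrative names, not the 6507 instruction set or the jaxtari API). Every handler is evaluated and a one-hot mask keeps the selected result, which is the shape of computation that can run in lockstep across a batch.

```python
# Hypothetical 3-instruction toy machine (not the 6507 instruction set).
HANDLERS = [
    lambda acc: (acc + 1) & 0xFF,   # opcode 0: increment, wrapping at 8 bits
    lambda acc: (acc - 1) & 0xFF,   # opcode 1: decrement, wrapping at 8 bits
    lambda acc: acc,                # opcode 2: no-op
]


def hard_step(opcode, acc):
    """Integer dispatch: run only the selected handler."""
    return HANDLERS[opcode](acc)


def all_branch_step(opcode, acc):
    """Run every handler, then keep the selected result via a one-hot mask."""
    results = [handler(acc) for handler in HANDLERS]
    mask = [1 if i == opcode else 0 for i in range(len(HANDLERS))]
    return sum(m * r for m, r in zip(mask, results))


def batched_step(opcodes, accs):
    """Apply the same all-branch step to every environment in a batch."""
    return [all_branch_step(op, acc) for op, acc in zip(opcodes, accs)]


# Four independent toy environments, each with its own opcode and accumulator.
opcodes = [0, 1, 2, 0]
accs = [0xFF, 0x00, 0x2A, 0x10]
next_accs = batched_step(opcodes, accs)
```

The all-branch form does more work per step than integer dispatch, but every batch element executes the same instructions, which is the property the caption credits for the GPU scaling.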

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ROM-as-weights and gate-as-control reformulation could be tried on other fixed-hardware emulators to create additional known-complex test objects.
  • Direct gradient descent through the emulator itself becomes possible for tasks that previously required black-box reinforcement learning.
  • Qualitative studies of gradient flow through the exposed surrogate paths could show how explanations scale from simple logic to full game cartridges.

Load-bearing premise

The soft-logic reformulation preserves exact forward-pass equivalence to the discrete hardware at finite temperature.

What would settle it

Any run of the differentiable emulator on one of the 64 games that produces even a single differing RAM byte or screen pixel compared with the xitari reference.
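
The falsification test above amounts to a per-frame comparison of RAM bytes and screen pixels. The following is a minimal sketch of such a check, assuming each trace is a list of `(ram, screen)` pairs of flat integer sequences; `first_mismatch`, `check_trace`, and the tiny traces are hypothetical stand-ins, not real emulator output or the paper's test harness.

```python
def first_mismatch(reference, candidate):
    """Return the index of the first differing element, or None if identical."""
    for i, (a, b) in enumerate(zip(reference, candidate)):
        if a != b:
            return i
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


def check_trace(ref_frames, test_frames):
    """Compare per-frame (ram, screen) pairs and report the first divergence."""
    for frame, (ref, test) in enumerate(zip(ref_frames, test_frames)):
        ref_ram, ref_screen = ref
        ram, screen = test
        ram_idx = first_mismatch(ref_ram, ram)
        if ram_idx is not None:
            return ("ram", frame, ram_idx)
        px_idx = first_mismatch(ref_screen, screen)
        if px_idx is not None:
            return ("screen", frame, px_idx)
    if len(ref_frames) != len(test_frames):
        return ("length", min(len(ref_frames), len(test_frames)), 0)
    return None


# Synthetic two-frame traces: 3 RAM bytes and 3 pixels per frame.
ref = [([1, 2, 3], [0, 0, 7]), ([1, 2, 4], [0, 5, 7])]
same = [([1, 2, 3], [0, 0, 7]), ([1, 2, 4], [0, 5, 7])]
bad = [([1, 2, 3], [0, 0, 7]), ([1, 2, 4], [0, 6, 7])]

ok_result = check_trace(ref, same)    # no divergence found
bad_result = check_trace(ref, bad)    # one pixel differs in frame 1
```

A single non-`None` result on any of the 64 games would be the counterexample this section describes.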

Figures

Figures reproduced from arXiv: 2606.22447 by Andreas Maier, Patrick Krauss, Siming Bayer.

Figure 1: The Atari 2600 VCS. A 6507 CPU executes code

Figure 2: The two execution paths share one known mechanism. The HARD step uses integer dispatch and exact bit logic.

Figure 3: Soft-select temperature T in pixel space, on a visualisable selection: a target sprite's screen column chosen with w = softmax(ℓ/T). For each T we sample the column 100 times, render the hard sprite, and average, so each pixel is the expected occupancy. Low T is a single sharp sprite (the hard pick, σ ≈ 1.3 px). Raising T spreads it over neighbouring columns (σ ≈ 3, then 6 px), widening the range over whi…

Figure 4: The soft-branch gate g = σ(αz) (left) and its gradient dg/dz = α σ(αz)(1−σ(αz)) (right) for α ∈ {2, 6, 20}, with the operating point z = 0.25 marked. Too large an α (here 20) saturates the gate, leaving essentially no gradient away from the switch. Too small an α (here 2) gives a shallow gate whose gradient is weak and barely varies with z. The intermediate α = 6 places the operating point in the sigmoid's ste…

Figure 5: Ground-truth gradients on the real Space Invaders ROM in the differentiable VCS. (a) the live game 35 s after boot, the scene the real ROM renders, pixel-exact to xitari. (b) the screen's directional derivative with respect to the right joystick, recovered by a differentiable sampler over the player-cannon position (a sub-pixel bilinear kernel, as in spatial-transformer networks); it lights exactly the c…

Figure 6: Soft branch: the screen-space gradient ∂(screen)/∂z as the sharpness α is swept, on jutari. The target sprite's placement is a sigmoid-gated blend of a left and a right position, so the gradient is a dipole: blue where a pixel darkens (the placement being vacated) and red where it brightens (the one being entered). The colour bar is in absolute units (here ±0.90) and the per-panel number is the peak relativ…

Figure 7: Overview of the per-step bit-exactness likelihood

Figure 8: One frame of the 60-second full-simulator comparison (Space Invaders, beam-timed). HARD and SOFT-STE are pixel-identical (Theorem 3). SOFT α = 6, T = 0.14 is inside the bit-exact corner and stays identical to HARD for the whole rollout. SOFT α = 5.5, T = 0.145 is just past the read boundary and diverges mildly: the game stays recognisable but a column of invaders is missing, the localised effect of one corrupted sp…

Figure 9: Implementation effort reconstructed from the version-control log. The step curve is the cumulative number of commits

Figure 10: jaxtari soft-mode GPU throughput vs. batch size N (Pong, 3,000 steps; two commodity GPUs, GTX 1080 Ti and Quadro RTX 5000). Forward and forward+gradient both rise ∼1200× from N = 1 to a peak near N = 4096 (∼3M env-steps/s, ∼50× the CPU vmap asymptote, dotted), then roll off as the device saturates; the 256-way lax.switch all-branch dispatch parallelises across lanes on the GPU where it serialises on the CPU. p…
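
The two relaxation knobs named in the captions of Figures 3 and 4, the soft-select temperature T in w = softmax(ℓ/T) and the gate sharpness α in g = σ(αz), can be explored numerically. The sketch below is a standalone illustration using only the formulas quoted in those captions; the function names and the 5-column logits are hypothetical, and nothing here is taken from the jutari or jaxtari code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """w = softmax(logits / T), computed stably by subtracting the max."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function sigma(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(z, alpha):
    """Soft-branch gate g = sigma(alpha * z)."""
    return sigmoid(alpha * z)


def gate_grad(z, alpha):
    """Analytic gradient dg/dz = alpha * sigma(alpha z) * (1 - sigma(alpha z))."""
    s = sigmoid(alpha * z)
    return alpha * s * (1.0 - s)


# Hypothetical logits peaking at column 2 of a 5-column strip.
logits = [0.0, 1.0, 3.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.1)   # mass concentrates on column 2
broad = softmax_with_temperature(logits, 5.0)   # mass spreads to neighbours

# Gate gradient at the operating point z = 0.25 for the three sharpness values.
grads = {alpha: gate_grad(0.25, alpha) for alpha in (2, 6, 20)}
```

At z = 0.25 the α = 6 gate has the largest gradient of the three values, while α = 20 is already saturated there, in line with the trade-off the Figure 4 caption describes.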
Original abstract

Explanation requires ground truth: to verify an account of a system we must know its inner functioning, which is just what is missing where explainable AI (XAI) is most needed. Systems we can study fall into two camps. Simple, procedural ones (decision trees, rule lists, sparse linear models) have a known but trivial mechanism, so explaining them tests nothing; genuinely complex ones (deep networks, real-world tasks) need XAI but have no ground-truth inner functioning, so an explanation can be plausible, confident, and wrong with no way to tell. We remove this dichotomy with a study object that is both genuinely complex and fully specified (inspectable by construction) and, so that gradient methods apply, fully differentiable. We reimplement the Atari 2600 Video Computer System (VCS), a real computer architecture and the cradle of deep reinforcement learning, as two independent end-to-end differentiable emulators in Julia (jutari) and JAX (jaxtari), each validated bit-for-bit against xitari. Both reproduce xitari on all 64 supported Arcade Learning Environment (ALE) games: 64/64 byte-identical RAM and 64/64 pixel-identical screens. Treating the cartridge ROM as a weight tensor, RAM as a soft tape, and control flow as gates, we prove the differentiable (soft) execution equals the original (hard) one bit-for-bit in the forward pass at any finite temperature, while exposing surrogate gradients where the bit logic has none. The JAX port also opens a GPU path: batched differentiable rollouts reach millions of environment-steps/s on one commodity GPU. The system was built in roughly 137 active hours over 29 calendar days, much of it written autonomously by coding agents. This paper builds and validates the foundation, showing, theoretically and in a qualitative gradient study, that gradient-based XAI on it is feasible. Both ports' full code is available under the MIT license at https://github.com/akmaier/UnderstandingVCS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper claims to have created two end-to-end differentiable emulators of the Atari 2600 VCS (jutari in Julia and jaxtari in JAX) that achieve byte-identical RAM and pixel-identical screen matches with the xitari reference on all 64 supported ALE games. Treating cartridge ROM as a weight tensor, RAM as a soft tape, and control flow as gates, it asserts a proof that the differentiable (soft) execution equals the original (hard) execution bit-for-bit in the forward pass at any finite temperature while exposing surrogate gradients; the JAX version enables GPU batched rollouts at high throughput, and the system is positioned as a fully known ground-truth benchmark for gradient-based XAI methods.

Significance. If the claimed forward-pass equivalence holds, the work supplies a genuinely complex yet fully inspectable and differentiable system for XAI evaluation, directly addressing the dichotomy between trivial known mechanisms and opaque complex ones. The exhaustive 64/64 empirical validation, open MIT-licensed code, and GPU acceleration path constitute concrete strengths that would make the artifact useful for the community.

major comments (1)
  1. [Abstract and theoretical argument] Abstract and theoretical argument: the central claim of exact bit-for-bit forward-pass identity between the soft-logic reformulation and discrete hardware at any finite temperature rests on the specific gate and tape constructions. The supplied evidence is the xitari match; this empirically confirms reproduction of hard behavior but does not independently establish that the soft construction itself is mathematically identical for arbitrary finite temperature, because any unstated assumption about how temperature enters the control-flow gates could break the identity while still passing the xitari test.
minor comments (1)
  1. The abstract refers to 'a qualitative gradient study' demonstrating feasibility of gradient-based XAI; a short description or pointer to the relevant section/figure would improve clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and the opportunity to clarify the theoretical claims. We respond to the major comment below.

Point-by-point responses
  1. Referee: Abstract and theoretical argument: the central claim of exact bit-for-bit forward-pass identity between the soft-logic reformulation and discrete hardware at any finite temperature rests on the specific gate and tape constructions. The supplied evidence is the xitari match; this empirically confirms reproduction of hard behavior but does not independently establish that the soft construction itself is mathematically identical for arbitrary finite temperature, because any unstated assumption about how temperature enters the control-flow gates could break the identity while still passing the xitari test.

    Authors: We thank the referee for highlighting the need for a clearer separation between the mathematical construction and its empirical validation. The manuscript derives the bit-for-bit forward-pass identity directly from the definitions of the soft gates (which implement exact logical operations via temperature-independent selection) and the soft tape (which preserves exact addressing and state updates). Temperature enters exclusively in the backward pass to supply surrogate gradients; the forward computation is constructed to be identical to the discrete case for any finite temperature by design. The xitari match validates that the implementation faithfully realizes these constructions across all 64 games. We agree that the current exposition could make the independence from temperature more explicit and will revise the theoretical section to include a step-by-step derivation of the identity, stating all assumptions regarding gate and tape behavior. revision: yes
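
The distinction the rebuttal draws, a forward pass that never sees the relaxation and a backward pass that does, is the pattern usually called a straight-through estimator. The sketch below illustrates it for a single branch gate in plain Python, assuming the sigmoid surrogate from the Figure 4 caption; `ste_branch` and `relaxed_branch` are hypothetical names, and how the paper actually parameterises its gates is not shown on this page.

```python
import math


def sigmoid(x):
    """Logistic function sigma(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_branch(z):
    """Discrete branch decision: 1 if taken, 0 otherwise."""
    return 1 if z > 0 else 0


def ste_branch(z, alpha):
    """Straight-through gate: hard forward value, sigmoid surrogate gradient.

    Returns (forward_value, gradient_used_in_backward_pass).
    """
    forward = hard_branch(z)                 # does not depend on alpha
    s = sigmoid(alpha * z)
    surrogate_grad = alpha * s * (1.0 - s)   # gradient of the relaxed gate
    return forward, surrogate_grad


def relaxed_branch(z, alpha):
    """Fully relaxed gate: the forward value itself depends on alpha."""
    return sigmoid(alpha * z)


# The straight-through forward value matches the hard decision for every alpha.
inputs = [-0.5, 0.25, 1.0]
alphas = [2, 6, 20]
forward_matches = all(
    ste_branch(z, alpha)[0] == hard_branch(z) for z in inputs for alpha in alphas
)

# The fully relaxed gate only approaches the hard value; it does not equal it.
relaxed_value = relaxed_branch(0.25, 6)
```

In this toy the identity of the forward pass holds by construction, which is a different kind of evidence from the xitari match: the former is a property of the definition, the latter a check that an implementation realises it.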

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained via explicit proof and external validation

full rationale

The paper asserts a mathematical proof that the soft execution equals the hard one bit-for-bit at finite temperature, supported by bit-identical validation against the independent xitari emulator on 64 games. No equations define a quantity in terms of itself, no parameters are fitted and then called predictions, and no load-bearing claims rest on self-citations. The central equivalence is presented as proven independently of the empirical match, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper introduces no free parameters or invented entities. It relies on the domain assumption that the Atari 2600 architecture can be faithfully modeled by the described soft operations and on the standard mathematical fact that finite-temperature relaxations of discrete gates admit well-defined surrogate gradients.

axioms (2)
  • domain assumption: The xitari reference emulator correctly implements the Atari 2600 hardware specification.
    All validation is performed by matching against xitari outputs.
  • standard math: Finite-temperature soft logic gates admit surrogate gradients that do not alter the exact forward-pass identity.
    Invoked in the proof that soft execution equals hard execution bit-for-bit.



and its backward pass is a surrogate. The fully relaxed mode (full) is used only for the temperature-limit analysis (Theorem 4); its forward pass is bit-exact only inside the corner of small T and large α.

Mode      Forward          Gradient   Used for
hard      bit-exact        none       conformance
soft-ste  = hard           surrogate  attribution
full      exact in corner  relaxed    T→0 study

Numerical scope. Soft mode keeps...