What exactly did the Transformer learn from our physics data?

2025 · astro-ph.IM · arXiv 2505.21042

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Transformer networks excel in scientific applications. We explore two scenarios in ultra-high-energy cosmic ray simulations to examine what these network architectures learn. First, we investigate the trained positional encodings in air showers which are azimuthally symmetric. Second, we visualize the attention values assigned to cosmic particles originating from a galaxy catalog. In both cases, the Transformers learn plausible, physically meaningful features.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Dissecting Jet-Tagger Through Mechanistic Interpretability

hep-ph · 2026-05-11 · accept · novelty 8.0

A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most performance and whose residual stream preferentially encodes 2-prong energy correlators.

citing papers explorer

Dissecting Jet-Tagger Through Mechanistic Interpretability

hep-ph · 2026-05-11 · accept · none · ref 36 · internal anchor

A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most performance and whose residual stream preferentially encodes 2-prong energy correlators.
