Pith · machine review for the scientific record

arxiv: 2505.03258 · v2 · submitted 2025-05-06 · ✦ hep-ph · hep-ex

Recognition: unknown

IAFormer: Interaction-Aware Transformer network for collider data analysis

Authors on Pith: no claims yet
classification ✦ hep-ph hep-ex
keywords: attention · IAFormer · network · particle · sparse · transformer · mechanism
Original abstract

In this paper, we introduce \texttt{IAFormer}, a novel Transformer-based architecture that efficiently integrates pairwise particle interactions through a dynamic sparse attention mechanism. \texttt{IAFormer} introduces two new mechanisms. First, the attention matrix depends on predefined boost-invariant pairwise quantities, significantly reducing the number of network parameters relative to the original Particle Transformer models. Second, \texttt{IAFormer} incorporates sparse attention via "differential attention", so that it can dynamically prioritize relevant particle tokens while reducing the computational overhead associated with less informative ones. This approach significantly lowers model complexity without compromising performance. Despite being more than an order of magnitude more computationally efficient than the Particle Transformer network, \texttt{IAFormer} achieves state-of-the-art performance in classification tasks on the top tagging and quark-gluon datasets. Furthermore, we employ AI interpretability techniques to verify that the model captures physically meaningful information layer by layer through its sparse attention mechanism, building an efficient network output that is resistant to statistical fluctuations. \texttt{IAFormer} highlights the value of sparse attention in Transformer analyses for reducing network size while improving performance.
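
As a rough illustration of the two mechanisms the abstract describes, here is a minimal PyTorch sketch, not the authors' code: per-head attention logits are derived directly from boost-invariant pairwise features (so no per-token query/key projections, which is where the parameter saving would come from), and two logit groups are combined as softmax(A1) − λ·softmax(A2) in the style of differential attention. The feature choice (ln ΔR_ij, ln m²_ij), the module name `InteractionDiffAttention`, and the way the two logit groups are formed are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pairwise_features(p4, eps=1e-8):
    """Boost-invariant pairwise quantities for 4-momenta p4 of shape
    (batch, n, 4) ordered (E, px, py, pz). Returns (batch, n, n, 2) with
    ln(Delta R_ij) and ln(m_ij^2); a common choice, assumed here rather
    than taken from the paper."""
    _, px, py, pz = p4.unbind(-1)
    pt = torch.sqrt(px ** 2 + py ** 2).clamp_min(eps)
    eta = torch.asinh(pz / pt)
    phi = torch.atan2(py, px)
    deta = eta.unsqueeze(-1) - eta.unsqueeze(-2)
    dphi = (phi.unsqueeze(-1) - phi.unsqueeze(-2) + torch.pi) % (2 * torch.pi) - torch.pi
    ln_dr = torch.sqrt(deta ** 2 + dphi ** 2).clamp_min(eps).log()
    psum = p4.unsqueeze(1) + p4.unsqueeze(2)                    # (b, n, n, 4): p_i + p_j
    m2 = (psum[..., 0] ** 2 - psum[..., 1:].pow(2).sum(-1)).clamp_min(eps)
    return torch.stack([ln_dr, m2.log()], dim=-1)

class InteractionDiffAttention(nn.Module):
    """One attention block in the spirit of the abstract: logits come from
    the pairwise features alone, and two logit groups are subtracted as in
    differential attention, with a learnable weight lambda."""
    def __init__(self, dim, n_heads=4, n_pair=2):
        super().__init__()
        self.h = n_heads
        # two small per-head maps from pairwise features to attention logits
        self.logits1 = nn.Linear(n_pair, n_heads)
        self.logits2 = nn.Linear(n_pair, n_heads)
        self.lam = nn.Parameter(torch.tensor(0.5))  # differential-attention weight
        self.v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x, pair):
        # x: (b, n, dim) particle tokens; pair: (b, n, n, n_pair)
        b, n, d = x.shape
        a1 = self.logits1(pair).permute(0, 3, 1, 2)             # (b, h, n, n)
        a2 = self.logits2(pair).permute(0, 3, 1, 2)
        attn = F.softmax(a1, dim=-1) - self.lam * F.softmax(a2, dim=-1)
        v = self.v(x).view(b, n, self.h, d // self.h).transpose(1, 2)
        y = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.out(y)

# Smoke test on random kinematics: checks shapes only.
p4 = torch.randn(2, 16, 4).abs() + 0.1       # toy 4-momenta, energies kept positive
x = torch.randn(2, 16, 32)                   # toy particle embeddings
block = InteractionDiffAttention(dim=32)
print(block(x, pairwise_features(p4)).shape)  # torch.Size([2, 16, 32])
```

In real use one would stack several such blocks with residual connections and MLPs, as in a standard Transformer encoder; the sketch only shows how interaction-derived logits and the differential subtraction might fit together.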

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dissecting Jet-Tagger Through Mechanistic Interpretability

    hep-ph · 2026-05 · accept · novelty 8.0

    A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most of the tagger's performance and whose residual stream preferentially encodes 2-prong energy correlators.