Graph neural networks in particle physics
11 Pith papers cite this work. Polarity classification is still indexing.
citation roles: background 4
citation polarities: background 4
citing papers explorer
- Dissecting Jet-Tagger Through Mechanistic Interpretability
  A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most performance and whose residual stream preferentially encodes 2-prong energy correlators.
- From Syntax to Semantics: Unveiling the Emergence of Chirality in SMILES Translation Models
  Chirality emerges in SMILES translation models through an abrupt encoder-centered reorganization of representations after a long plateau, identified via checkpoint analysis and ablation.
- Criticality and Saturation in Orthogonal Neural Networks
  Derives layer-wise recursions for finite-width tensors under orthogonal initialization that reproduce the observed large-depth stability of nonlinear networks.
- Reconstructing conformal field theoretical compositions with Transformers
  Transformers reconstruct the constituent RCFTs in tensor-product theories from low-energy spectra, reaching 98% accuracy on WZW models and generalizing to larger central charges with few out-of-domain examples.
- Chemo-mechanical coupling stabilizes mixed $\mathrm{Ag}_{x}\mathrm{Cu}_{1-x}\mathrm{GaSe}_{2}$ solar-cell absorbers: Insights from Monte-Carlo simulations assisted by ab initio informed machine-learning potentials
  Monte-Carlo simulations with an ML potential demonstrate that coherency strain removes the Ag-Cu miscibility gap in $\mathrm{Ag}_{x}\mathrm{Cu}_{1-x}\mathrm{GaSe}_{2}$, producing complete mixing.
- GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation
  GSC-QEMit adaptively mitigates quantum errors using hierarchical context clustering, Gaussian-process forecasting, and contextual bandits, delivering 9% higher average logical fidelity than unmitigated runs in Qiskit Aer simulations.
- Towards foundation-style models for energy-frontier heterogeneous neutrino detectors via self-supervised pre-training
  Self-supervised pre-training on multimodal neutrino detector simulations produces reusable representations that improve downstream classification, regression, and data efficiency over training from scratch.
- Variational Autoregressive Networks with probability priors
  Incorporating probability priors into variational autoregressive networks reduces training burden and enables larger system sizes for sampling in the Ising and Edwards-Anderson models.
- Reconfigurable Computing Challenge: Real-Time Graph Neural Networks for Online Event Selection in Big Science
  Hybrid FPGA-AI Engine deployment of a dynamic GNN for the Belle II trigger achieves 2.94M events/s throughput at 7.15 µs latency, with 53% better throughput and DSP usage reduced from 99% to 19%.
- Impacts of Data Splitting Strategies on Parameterized Link Prediction Algorithms
  Improper use of test data during hyperparameter tuning in link prediction inflates performance estimates by an average of 3.6 percent across 60 networks, as measured by a new Loss Ratio metric.
- FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning