arxiv: 2604.05034 · v1 · submitted 2026-04-06 · ✦ hep-ph · cs.LG · hep-th

Recognition: 2 theorem links

· Lean Theorem

Learning to Unscramble Feynman Loop Integrals with SAILIR

David Shih

Pith reviewed 2026-05-10 19:26 UTC · model grok-4.3

classification ✦ hep-ph · cs.LG · hep-th

keywords: Feynman integrals · IBP reduction · machine learning · self-supervised learning · loop integrals · precision calculations

The pith

SAILIR trains a classifier on synthetic scrambled integrals to reduce Feynman loop integrals step by step, with per-worker memory that stays approximately flat as integral complexity grows.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that integration-by-parts reduction of Feynman integrals can be guided by a transformer classifier trained to undo artificial scrambles, which are built by applying known reduction identities in reverse. Because the process runs online without assembling large equation systems, per-worker memory consumption remains approximately flat even for high-weight integrals. This matters for precision calculations in high-energy physics, where traditional Laporta-style methods quickly exhaust available memory as integral complexity increases. Benchmarks on 16 two-loop triangle-box integrals show SAILIR using roughly 40 percent of Kira's memory on the hardest cases with comparable reduction times, although it is slower in wall-clock time overall.

Core claim

A transformer classifier trained entirely self-supervised on data generated by applying known IBP identities in reverse can guide one-step-at-a-time reduction of Feynman integrals. When paired with beam search in a parallel asynchronous setup, the method reduces integrals of arbitrarily high weight while keeping memory usage bounded, in contrast to conventional solvers whose memory scales rapidly with complexity.

What carries the argument

The self-supervised transformer classifier that learns to select the next correct reduction step by reversing synthetic scrambles of known identities.
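
The scramble/unscramble idea can be illustrated with a toy sketch. It is not the paper's data pipeline: the state is a hypothetical tuple of propagator powers, a "scramble" raises one power, and the label is the index whose lowering undoes that step. All function names here are assumptions made for illustration.

```python
import random
from typing import List, Tuple

State = Tuple[int, ...]  # toy stand-in for an integral: its propagator powers


def scramble_step(state: State, k: int) -> State:
    """Apply toy identity k in reverse: raise index k by one (adds complexity)."""
    return state[:k] + (state[k] + 1,) + state[k + 1:]


def unscramble_step(state: State, k: int) -> State:
    """Apply toy identity k forward: lower index k by one (removes complexity)."""
    return state[:k] + (state[k] - 1,) + state[k + 1:]


def make_trajectory(start: State, n_steps: int, rng: random.Random) -> List[Tuple[State, int]]:
    """Scramble `start` n_steps times, labelling each state with the undoing action."""
    examples = []
    state = start
    for _ in range(n_steps):
        k = rng.randrange(len(state))
        state = scramble_step(state, k)
        examples.append((state, k))  # (classifier input, target label)
    return examples


def replay(examples: List[Tuple[State, int]]) -> State:
    """Undo a trajectory by applying its labels in reverse order."""
    state = examples[-1][0]
    for _, k in reversed(examples):
        state = unscramble_step(state, k)
    return state


examples = make_trajectory((1, 1, 1), 5, random.Random(0))
print(replay(examples))  # (1, 1, 1): the oracle labels lead back to the start
```

Each `(state, label)` pair costs nothing to produce because the label is known by construction. That is the property that makes training self-supervised.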

If this is right

IBP reduction becomes possible for integrals whose complexity would otherwise exhaust memory in conventional solvers.
Precision calculations previously limited by memory can proceed using far less RAM per worker.
The online, one-step reduction strategy removes the need to solve large linear systems at once.
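
A rough accounting, offered as a sketch and not taken from the paper, shows why an online strategy of this kind can keep per-worker memory flat. The symbols are assumptions introduced here: $N$ is the maximum number of active beam states, $K$ the number of candidate actions expanded per state, $s_{\max}$ the largest in-memory size of a single state, and $M_{\text{model}}$ the fixed footprint of the classifier.

```latex
M_{\text{worker}} \;\lesssim\; M_{\text{model}} + N\,(1 + K)\,s_{\max}
```

No term counts the equations of a global linear system, so under these assumptions the estimate grows with integral weight only if $s_{\max}$ does.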

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same scramble-and-unscramble training could be applied to other symbolic reduction problems that currently face memory scaling walls.
If the learned classifier generalizes across topologies, it could serve as a lightweight front end that hands off only unsolved integrals to traditional tools.
Extending the synthetic data generation to include more topologies would test whether the approach remains reliable without retraining on real integrals.

Load-bearing premise

A classifier trained only on synthetic scrambled data will reliably select reduction steps that succeed on real high-weight Feynman integrals without entering dead-end paths that beam search cannot escape.

What would settle it

Finding a real high-weight integral that Kira reduces successfully but where SAILIR's classifier choices lead to a dead end with no recovery by beam search.

Figures

Figures reproduced from arXiv: 2604.05034 by David Shih.

**Figure 2.** Training and validation cross-entropy loss (left) and ...

**Figure 3.**

Original abstract

Integration-by-parts (IBP) reduction of Feynman integrals to master integrals is a key computational bottleneck in precision calculations in high-energy physics. Traditional approaches based on the Laporta algorithm require solving large systems of equations, leading to memory consumption that grows rapidly with integral complexity. We present SAILIR (Self-supervised AI for Loop Integral Reduction), a new machine learning approach in which a transformer-based classifier guides the reduction of integrals one step at a time in a fully online fashion. The classifier is trained in an entirely self-supervised manner on synthetic data generated by a scramble/unscramble procedure: known reduction identities are applied in reverse to build expressions of increasing complexity, and the classifier learns to undo these steps. When combined with beam search and a highly parallelized, asynchronous, single-episode reduction strategy, SAILIR can reduce integrals of arbitrarily high weight with bounded memory. We benchmark SAILIR on the two-loop triangle-box topology, comparing against the state-of-the-art IBP reduction code Kira across 16 integrals of varying complexity. While SAILIR is slower in wall-clock time, its per-worker memory consumption remains approximately flat regardless of integral complexity, in contrast to Kira whose memory grows rapidly with complexity. For the most complex integrals considered here, SAILIR uses only 40% of the memory of Kira while achieving comparable reduction times. This demonstrates a fundamentally new paradigm for IBP reduction in which the memory bottleneck of Laporta-based approaches could be entirely overcome, potentially opening the door to precision calculations that are currently intractable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SAILIR shows flat memory for IBP reduction on one topology via self-supervised transformers but leaves generalization to real high-weight integrals untested.

The main takeaway is that this paper presents a transformer-based system for reducing Feynman loop integrals step by step, trained entirely on synthetic scrambled data, and that it keeps per-worker memory usage roughly flat as complexity increases. Its strength is a concrete memory comparison against Kira on 16 two-loop integrals from the triangle-box family: memory stays roughly constant per worker regardless of integral size, reaching 40% of Kira's usage on the hardest cases with similar reduction times. The self-supervised scramble/unscramble training is a nice way to generate data without manual labels. The limitation is that everything is confined to that one topology. There are no tests on other structures or higher loops, so the claim of handling arbitrarily high weight with bounded memory is not yet backed by broad evidence. The concern about the classifier entering unrecoverable paths on real integrals seems valid given the synthetic training distribution. This is relevant for high-energy physicists working on precision calculations that hit memory walls with traditional IBP tools. Someone looking for new algorithmic ideas in this area would find it worth reading. I would recommend sending it for peer review to get feedback on scaling and generalization.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces SAILIR, a transformer-based classifier trained in a fully self-supervised manner on synthetic data generated by reversing known IBP identities (scramble/unscramble procedure). The classifier guides online, step-by-step IBP reduction of Feynman loop integrals, combined with beam search and asynchronous single-episode execution to achieve bounded memory usage independent of integral weight. Benchmarks are performed on 16 two-loop integrals in the triangle-box topology, showing flat per-worker memory consumption (approximately 40% of Kira for the most complex cases) while achieving comparable reduction times, albeit with higher wall-clock time.

Significance. If the generalization from synthetic training data to real high-weight integrals holds, the method would represent a significant advance by decoupling memory requirements from integral complexity, addressing a core limitation of Laporta-style algorithms in precision hep-ph calculations. The self-supervised training avoids circularity and the online reduction strategy is a clear technical strength; however, the current evidence is confined to a single topology and does not yet establish the broad applicability claimed.

major comments (3)

[Abstract, §4 (benchmarks)] The central claim that SAILIR reduces integrals of arbitrarily high weight with bounded memory rests on reliable step selection by the classifier. However, all benchmarks and training are restricted to the two-loop triangle-box topology (16 integrals); no results are shown for other topologies, higher loops, or out-of-distribution structures. This directly limits support for the generalization required by the abstract and introduction claims.
[§3 (training procedure), §4] The manuscript provides no quantitative details on training convergence, validation loss curves, generalization error on held-out synthetic data, or observed failure modes (e.g., dead-end paths that beam search cannot recover from). Without these, it is difficult to assess the reliability of the classifier on real integrals beyond the reported cases.
[§2 (method), §4] The bounded-memory guarantee is demonstrated only empirically on the 16 selected integrals; the paper does not provide a theoretical argument or scaling analysis showing why the beam-search depth and classifier accuracy remain sufficient as weight increases beyond the tested range.

minor comments (2)

[Abstract] The abstract states 'comparable reduction times' but the text should clarify whether this refers to per-worker CPU time or total wall-clock time, given the noted higher wall-clock for SAILIR.
[§2] Notation for the beam-search parameters (width, depth) and the precise definition of 'weight' used in the benchmarks should be made explicit in the methods section for reproducibility.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We are pleased that the potential of the self-supervised learning approach and the online reduction strategy for achieving bounded memory usage has been recognized. We address each of the major comments below and will make appropriate revisions to the manuscript to improve clarity and provide additional supporting information.

Referee: [Abstract, §4 (benchmarks)] The central claim that SAILIR reduces integrals of arbitrarily high weight with bounded memory rests on reliable step selection by the classifier. However, all benchmarks and training are restricted to the two-loop triangle-box topology (16 integrals); no results are shown for other topologies, higher loops, or out-of-distribution structures. This directly limits support for the generalization required by the abstract and introduction claims.

Authors: We concur that the current benchmarks are confined to the two-loop triangle-box topology with 16 integrals. The self-supervised training via the scramble/unscramble procedure is topology-independent in its formulation, relying on reversing IBP identities to generate training data. To address the concern, we will revise the abstract and introduction to more precisely state that the method enables bounded-memory reduction for high-weight integrals in the tested topology, while highlighting the general applicability of the framework. We will include a brief discussion on extending to other topologies. This revision will temper the generalization claims to align with the presented evidence.
Revision: partial

Referee: [§3 (training procedure), §4] The manuscript provides no quantitative details on training convergence, validation loss curves, generalization error on held-out synthetic data, or observed failure modes (e.g., dead-end paths that beam search cannot recover from). Without these, it is difficult to assess the reliability of the classifier on real integrals beyond the reported cases.

Authors: This is a valid point, and we will incorporate the requested details in the revised version. Specifically, we will add figures showing the training and validation loss curves to demonstrate convergence. Additionally, we will report the accuracy on held-out synthetic data and describe any failure modes encountered, such as dead-end paths, along with how the beam search strategy helps in recovering from suboptimal classifier predictions. These additions will provide a clearer picture of the model's reliability.
Revision: yes

Referee: [§2 (method), §4] The bounded-memory guarantee is demonstrated only empirically on the 16 selected integrals; the paper does not provide a theoretical argument or scaling analysis showing why the beam-search depth and classifier accuracy remain sufficient as weight increases beyond the tested range.

Authors: The memory bound is inherent to the online, single-episode reduction with fixed beam width and asynchronous execution, as memory usage depends only on the beam size and current integral representations rather than the cumulative system size or total weight. We will add a dedicated subsection providing a scaling analysis based on the empirical data and explaining the design choices that support bounded usage. Regarding classifier accuracy at higher weights, while empirical evidence supports it within the tested range, a comprehensive theoretical analysis of generalization is not provided and would require further research. We will note this limitation explicitly.
Revision: partial

standing simulated objections not resolved

A full theoretical proof or rigorous scaling analysis guaranteeing the classifier's performance for weights significantly beyond the tested range.

Circularity Check

0 steps flagged

No significant circularity; claims rest on independent empirical benchmarks

The paper's core method trains a transformer classifier via self-supervised reversal of known IBP identities on synthetic data, then evaluates reduction performance by direct comparison to the external Kira solver on 16 real integrals from the two-loop triangle-box topology. No load-bearing step reduces by construction to its inputs: the bounded-memory claim follows from the online, step-by-step design rather than any fitted parameter or self-referential prediction; the performance numbers (40% memory of Kira) are measured against an independent baseline, not generated from the training distribution alone. No self-citations, uniqueness theorems, or ansatzes are invoked to justify the central results. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The method rests on standard assumptions about transformer expressivity and the completeness of IBP identities; no new physical entities or ad-hoc constants are introduced.

axioms (1)

Domain assumption: A transformer classifier trained on reverse-applied IBP identities can learn to select valid forward reduction steps on unseen integrals.
Core premise of the self-supervised training loop.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "The classifier is trained in an entirely self-supervised manner on synthetic data generated by a scramble/unscramble procedure: known reduction identities are applied in reverse to build expressions of increasing complexity, and the classifier learns to undo these steps."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "SAILIR can reduce integrals of arbitrarily high weight with bounded memory"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

An Algorithm for the Symbolic Reduction of Multi-loop Feynman Integrals via Generating Functions
hep-ph · 2026-05 · unverdicted · novelty 7.0
A new generating-function framework turns IBP relations into differential equations in a non-commutative algebra, yielding an iterative algorithm that derives symbolic reduction rules and checks completeness for topol...

SIRENA -- Sum-Integral REductioN Algorithm
hep-ph · 2026-05 · unverdicted · novelty 7.0
SIRENA automates IBP reduction of sum-integrals in finite-temperature QFT, reproduces known results to 3 loops, supplies new 3-loop fermionic reductions, and derives an analytic factorization formula for arbitrary 2-l...

When Does Critique Improve AI-Assisted Theoretical Physics? SCALAR: Structured Critic--Actor Loop for Agentic Reasoning
cs.AI · 2026-05 · unverdicted · novelty 7.0
Structured critic-actor loops improve AI performance on theoretical physics reasoning tasks, with benefits strongest in asymmetric model pairings using constructive feedback.

Feynman integral reduction by covariant differentiation
hep-ph · 2026-04 · unverdicted · novelty 7.0
Covariant differentiation on the dual vector space spanned by master integrals reduces a large class of Feynman integrals to masters, with connections reusable across mass configurations.

A Scientific Human-Agent Reproduction Pipeline
hep-ph · 2026-04 · unverdicted · novelty 6.0
SHARP is a human-AI collaboration pipeline for reproducing scientific analyses, demonstrated by recreating a jet classification task from a particle physics paper.

Reference graph

Works this paper leans on

41 extracted references · 29 canonical work pages · cited by 5 Pith papers · 3 internal anchors

[1] For each beam state, identify the highest-weight non-master target integral and enumerate valid actions
[2] Batch all (state, target, actions) tuples and run the classifier in a single forward pass
[3] For each state, extract the top-K actions by model score, apply them in parallel to produce candidate next states
[4] Select the best K states for the next beam according to the sorting criterion. We use a mixed beam sort strategy that maintains two parallel beams of width K each: one sorted by maximum integral weight (favoring aggressive weight reduction) and one sorted by total weight (favoring overall simplification). After deduplication, the total number of active states...
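
The four steps above can be sketched in a few lines. This is a minimal toy under stated assumptions, not the paper's implementation: a state is a hypothetical tuple of integral weights, an action lowers one weight by one, `score` stands in for the classifier, and the tie-breaking inside each sort is a choice made here.

```python
import heapq
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # toy state: weights of the integrals still to be reduced


def mixed_beam_select(candidates: List[State], k: int) -> List[State]:
    """Keep the k best states under each sort criterion, then deduplicate."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep first-seen order
    by_max = sorted(unique, key=lambda s: (max(s), sum(s)))[:k]    # lowest maximum weight first
    by_total = sorted(unique, key=lambda s: (sum(s), max(s)))[:k]  # lowest total weight first
    return list(dict.fromkeys(by_max + by_total))  # at most 2 * k states survive


def beam_step(beam: List[State], score: Callable[[State, int], float], k: int) -> List[State]:
    """Expand every state by its top-k scored actions, then apply the mixed selection."""
    candidates = []
    for state in beam:
        actions = [i for i, w in enumerate(state) if w > 0]  # toy action: lower weight i
        for a in heapq.nlargest(k, actions, key=lambda a: score(state, a)):
            lowered = list(state)
            lowered[a] -= 1
            candidates.append(tuple(sorted(lowered, reverse=True)))
    return mixed_beam_select(candidates, k)


selected = mixed_beam_select([(5, 1), (3, 3), (4, 1), (3, 3), (2, 2, 2, 2)], 2)
print(selected)  # [(2, 2, 2, 2), (3, 3), (4, 1)]
next_beam = beam_step([(3, 2)], lambda s, a: s[a], 2)
print(next_beam)  # [(2, 2), (3, 1)]
```

The two sort keys can disagree: `(2, 2, 2, 2)` wins on maximum weight while `(4, 1)` wins on total weight, and the mixed beam keeps both.
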
[5] An orchestrator maintains the global expression and a cache of solved integrals
[6] For each non-master integral that appears in the expression, the orchestrator submits a one-episode worker job that reduces the integral’s weight by one level. 3. Worker isolation: Crucially, each one-step worker begins with an empty substitution history. The worker does not inherit the accumulated substitutions from the orchestrator or from prior workers....
[7] When a worker completes, its result (the solved integral expressed in terms of lower-weight and/or lower-sector integrals) is cached. 5. Memoization: When the same integral appears in a subsequent episode, the cached result is reused without recomputation. This memoization is critical for efficiency: in our benchmarks, cache hit rates range from 60% to 7...
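
The interplay of orchestrator, isolated workers, and cache in excerpts [5] to [7] can be sketched as follows. This is a hedged toy: integrals are hypothetical tuples of propagator powers, `one_step_worker` is a stand-in for a real reduction episode, coefficients are dropped, and the loop is synchronous and recursive where the paper describes an asynchronous, parallel setup.

```python
from typing import Dict, List, Set, Tuple

Integral = Tuple[int, ...]  # toy integral: a tuple of propagator powers


def is_master(integral: Integral) -> bool:
    """Toy convention (an assumption): no power above 1 means master integral."""
    return all(a <= 1 for a in integral)


def one_step_worker(integral: Integral) -> List[Integral]:
    """Toy worker: sees only `integral` (no inherited history) and rewrites it
    in terms of integrals one weight level lower. Coefficients are omitted."""
    return [integral[:i] + (a - 1,) + integral[i + 1:]
            for i, a in enumerate(integral) if a > 1]


def reduce_to_masters(integral: Integral,
                      cache: Dict[Integral, List[Integral]],
                      stats: Dict[str, int]) -> Set[Integral]:
    """Reduce `integral` to masters, caching each one-step result."""
    if is_master(integral):
        return {integral}
    if integral in cache:
        stats["hits"] += 1  # reuse the cached one-step result
    else:
        stats["jobs"] += 1  # submit a fresh worker job
        cache[integral] = one_step_worker(integral)
    masters: Set[Integral] = set()
    for lower in cache[integral]:
        masters |= reduce_to_masters(lower, cache, stats)
    return masters


cache: Dict[Integral, List[Integral]] = {}
stats = {"jobs": 0, "hits": 0}
masters = reduce_to_masters((3, 3), cache, stats)
print(masters, stats)  # {(1, 1)} {'jobs': 8, 'hits': 5}
```

In this toy, 5 of 13 lookups are cache hits because integrals such as `(2, 2)` are reached along more than one path. The figures are specific to the toy and say nothing about the paper's benchmarks.
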
[8] The process repeats until all non-master integrals are eliminated. To limit peak memory on each worker, model inference within each beam search step is sub-batched: the (state, target, actions) tuples are processed in groups of 50 through the classifier rather than all at once, preventing large intermediate tensors from accumulating. All inference is ...
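
The sub-batching in excerpt [8] amounts to chunked inference. The sketch below assumes a hypothetical `classifier` callable that maps a group of (state, target, actions) tuples to one score per tuple. The group size of 50 is taken from the excerpt and everything else is illustrative.

```python
from typing import Callable, List, Sequence


def score_in_sub_batches(
    items: Sequence[tuple],
    classifier: Callable[[Sequence[tuple]], List[float]],
    batch_size: int = 50,
) -> List[float]:
    """Run the classifier on fixed-size groups so that only one group's
    intermediate results exist at any time."""
    scores: List[float] = []
    for start in range(0, len(items), batch_size):
        group = items[start:start + batch_size]
        scores.extend(classifier(group))
    return scores


seen_batch_sizes: List[int] = []


def toy_classifier(group: Sequence[tuple]) -> List[float]:
    """Hypothetical stand-in for the model: records the group size and
    scores each tuple by its number of actions."""
    seen_batch_sizes.append(len(group))
    return [float(len(actions)) for (_state, _target, actions) in group]


items = [((i,), i, list(range(i % 3 + 1))) for i in range(120)]
scores = score_in_sub_batches(items, toy_classifier)
print(seen_batch_sizes)  # [50, 50, 20]
```

Peak memory per call is then set by the group size and not by the number of tuples produced in a beam step.
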
[9] The [CLS] output, [TARGET] output, substitution embedding, and sector embedding are concatenated (4×256 = 1024 dimensions) and projected to the 256-dimensional state vector h. Action embeddings attend to per-term expression embeddings via 2-layer multi-head ...
(Figure: cross-entropy loss versus epoch; curves labelled Train and Validation.)
[10] K. G. Chetyrkin and F. V. Tkachov, “Integration by parts: The algorithm to calculate β-functions in 4 loops,” Nucl. Phys. B 192, 159 (1981)
[11] F. V. Tkachov, “A theorem on analytical calculability of 4-loop renormalization group functions,” Phys. Lett. B 100, 65 (1981)
[12] A. Smirnov and V. Smirnov, “Two decades of algorithmic Feynman integral reduction,” (2025), arXiv:2510.10748 [hep-th]
[13] S. Laporta, “High-precision calculation of multiloop Feynman integrals by difference equations,” Int. J. Mod. Phys. A 15, 5087 (2000), arXiv:hep-ph/0102033
[14] P. Maierhöfer, J. Usovitsch, and P. Uwer, “Kira - A Feynman integral reduction program,” Comput. Phys. Commun. 230, 99 (2018), arXiv:1705.05610 [hep-ph]
[15] J. Klappert, F. Lange, P. Maierhöfer, and J. Usovitsch, “Integral reduction with Kira 2.0 and finite field methods,” Comput. Phys. Commun. 266, 108024 (2021), arXiv:2008.06494 [hep-ph]
[16] F. Lange, J. Usovitsch, and Z. Wu, “Kira 3: Integral reduction with efficient seeding and optimized equation selection,” Comput. Phys. Commun. 322, 109999 (2026), arXiv:2505.20197 [hep-ph]
[17] A. V. Smirnov and F. S. Chukharev, “FIRE6: Feynman Integral REduction with modular arithmetic,” Comput. Phys. Commun. 247, 106877 (2020), arXiv:1901.07808 [hep-ph]
[18] A. V. Smirnov and M. Zeng, “FIRE 7: Automatic Reduction with Modular Approach,” (2025), arXiv:2510.07150 [hep-ph]
[19] C. Studerus, “Reduze – Feynman integral reduction in C++,” Comput. Phys. Commun. 181, 1293 (2010), arXiv:0912.2546 [physics.comp-ph]
[20] A. von Manteuffel and C. Studerus, “Reduze 2 - Distributed Feynman Integral Reduction,” (2012), arXiv:1201.4330 [hep-ph]
[21] R. N. Lee, “LiteRed 1.4: a powerful tool for reduction of multiloop integrals,” J. Phys. Conf. Ser. 523, 012059 (2014), arXiv:1310.1145 [hep-ph]
[22] P. Kant, “Finding linear dependencies in integration-by-parts equations: A Monte Carlo approach,” Comput. Phys. Commun. 185, 1473 (2014), arXiv:1309.7287 [hep-ph]
[23] A. von Manteuffel and R. M. Schabinger, “A novel approach to integration by parts reduction,” Phys. Lett. B 744, 101 (2015), arXiv:1406.4513 [hep-ph]
[24] T. Peraro, “FiniteFlow: multivariate functional reconstruction using finite fields and dataflow graphs,” JHEP 07, 031 (2019), arXiv:1905.08019 [hep-ph]
[25] J. Klappert and F. Lange, “Reconstructing rational functions with FireFly,” Comput. Phys. Commun. 247, 106951 (2020), arXiv:1904.00009 [cs.SC]
[26] Z. Wu, J. Boehm, R. Ma, H. Xu, and Y. Zhang, “NeatIBP 1.0, a package generating small-size integration-by-parts relations for Feynman integrals,” Comput. Phys. Commun. 295, 108999 (2024), arXiv:2305.08783 [hep-ph]
[27] X. Guan, X. Liu, Y.-Q. Ma, and W.-H. Wu, “Blade: A package for block-triangular form improved Feynman integrals decomposition,” Comput. Phys. Commun. 310, 109538 (2025), arXiv:2405.14621 [hep-ph]
[28] D. A. Kosower, “Direct Solution of Integration-by-Parts Systems,” Phys. Rev. D 98, 025008 (2018), arXiv:1804.00131 [hep-ph]
[29] B. Feng, X. Li, Y. Liu, Y.-Q. Ma, and Y. Zhang, “Symbolic Reduction of Multi-loop Feynman Integrals via Generating Functions,” (2025), arXiv:2509.21769 [hep-ph]
[30] J. W. Liu and A. Mitov, “Untangling the IBP Equations,” (2025), arXiv:2512.05923 [hep-ph]
[31] L. de la Cruz and D. A. Kosower, “Seedless Reduction of Feynman Integrals,” (2026), arXiv:2602.22111 [hep-ph]
[32] S. Smith and M. Zeng, “Feynman integral reduction using syzygy-constrained symbolic reduction rules,” JHEP 01, 102 (2026), arXiv:2507.11140 [hep-ph]
[33] M. von Hippel and M. Wilhelm, “Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning,” JHEP 05, 185 (2025), arXiv:2502.05121 [hep-th]
[34] Z.-Y. Song, T.-Z. Yang, Q.-H. Cao, M.-x. Luo, and H. X. Zhu, “Explainable AI-assisted Optimization for Feynman Integral Reduction,” (2025), arXiv:2502.09544 [hep-ph]
[35] M. Zeng, “Reinforcement Learning and Metaheuristics for Feynman Integral Reduction,” Phys. Rev. D (2025), 10.1103/dmlf-jkfc, arXiv:2504.16045 [hep-ph]
[36] B. Romera-Paredes, M. Barekatain, A. Novikov, M. Balog, M. P. Kumar, E. Dupont, F. J. R. Ruiz, J. S. Ellenberg, P. Wang, O. Fawzi, P. Kohli, and A. Fawzi, “Mathematical discoveries from program search with large language models,” Nature (London) 625, 468 (2024)
[37] D. Shih, “Learning to Unscramble: Simplifying Symbolic Expressions via Self-Supervised Oracle Trajectories,” (2026), arXiv:2603.11164 [hep-th]
[38] S. Humeau, K. Shuster, M.-A. Lachaux, and J. Weston, “Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring,” in International Conference on Learning Representations (2020), arXiv:1905.01969 [cs.CL]
[39] M. Argeri and P. Mastrolia, “Feynman Diagrams and Differential Equations,” Int. J. Mod. Phys. A 22, 4375 (2007), arXiv:0707.4037 [hep-ph]
[40] R. Nogueira and K. Cho, “Passage Re-ranking with BERT,” (2019), arXiv:1901.04085 [cs.IR]
[41] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” (2018), arXiv:1810.04805 [cs.CL]