
arxiv: 2510.07286 · v3 · submitted 2025-10-08 · cs.LG · cs.AI · q-bio.BM · q-bio.QM

Evolutionary Profiles for Protein Fitness Prediction

Pith reviewed 2026-05-18 08:53 UTC · model grok-4.3

classification cs.LG · cs.AI · q-bio.BM · q-bio.QM
keywords protein fitness prediction · evolutionary profiles · inverse folding · masked language modeling · mutation impact · lightweight model · ProteinGym benchmark · homolog profiles

The pith

EvoIF predicts protein mutation fitness by combining within-family homolog profiles with cross-family inverse-folding constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a compact model can match or beat large protein language models on mutation fitness prediction by drawing on two distinct evolutionary signals. It frames natural sequences as expert demonstrations in an inverse reinforcement learning setup where masked language modeling extracts fitness estimates. This matters for protein engineering because the method needs far less training data and fewer parameters than current approaches. Ablation tests indicate the two signal types reinforce each other across varied proteins and mutation patterns. A sympathetic reader would see a route to more efficient, data-light predictors for sequence function.

Core claim

EvoIF integrates within-family profiles retrieved from homologs and cross-family structural-evolutionary constraints distilled from inverse folding logits, then fuses sequence-structure representations with these profiles through a compact transition block to produce calibrated probabilities for log-odds scoring.
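
The log-odds scoring named in the core claim can be illustrated with a minimal sketch. It assumes the model outputs one probability distribution over residues per position and that mutated sites contribute additively; the function name `log_odds_score`, the `Profile` type, and the toy probabilities are all hypothetical, not taken from the paper.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical type: one residue -> probability mapping per sequence position.
Profile = List[Dict[str, float]]


def log_odds_score(wild_type: str, mutations: List[Tuple[int, str]], probs: Profile) -> float:
    """Sum log p(mutant residue) - log p(wild-type residue) over mutated sites.

    Positions are 0-based. Treating sites as additive is a simplifying assumption.
    """
    score = 0.0
    for pos, mut_aa in mutations:
        wt_aa = wild_type[pos]
        score += math.log(probs[pos][mut_aa]) - math.log(probs[pos][wt_aa])
    return score


# Toy two-residue protein with made-up predicted probabilities.
wild_type = "AK"
probs = [
    {"A": 0.8, "G": 0.2},
    {"K": 0.5, "R": 0.5},
]
score_a1g = log_odds_score(wild_type, [(0, "G")], probs)  # negative: G is less likely than A
score_k2r = log_odds_score(wild_type, [(1, "R")], probs)  # zero: K and R are equally likely
```

A negative score marks a substitution the model considers less plausible than the wild type, which is the sense in which log-odds is read as a fitness estimate.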

What carries the argument

EvoIF, a lightweight fusion model that combines within-family evolutionary profiles from homologs and cross-family constraints from inverse-folding logits via a compact transition block.
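
As a rough illustration of what a within-family profile is, the sketch below turns a small set of aligned homologs into per-column residue frequencies. The pseudocount, the gap handling, and the names `homolog_profile` and `AMINO_ACIDS` are assumptions made for this example; the paper's retrieval and profile construction may differ.

```python
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def homolog_profile(aligned: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Per-column amino-acid frequencies from equal-length aligned sequences.

    Gaps ('-') and non-standard letters are ignored; the pseudocount keeps
    every probability strictly positive so that logs stay finite.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


# Four made-up aligned homologs of length 3.
homologs = ["ACD", "ACE", "A-D", "GCD"]
profile = homolog_profile(homologs)
```

Each column of `profile` sums to one, so it can be compared directly with a model's predicted distribution at the same position.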

If this is right

  • EvoIF and its MSA-enabled variant reach state-of-the-art or competitive results on 217 mutational assays covering more than 2.5 million mutants.
  • The model achieves this using only 0.15 percent of the training data required by recent large models and with fewer parameters.
  • The two profile types prove complementary and raise robustness across function types, MSA depths, taxa, and mutation depths.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same profile fusion might improve zero-shot performance on related tasks such as stability or binding affinity prediction.
  • Relying more on the cross-family component could help when deep multiple sequence alignments are unavailable for a target protein.
  • Extending the transition block to accept additional structural signals from structure prediction models could further tighten fitness estimates.

Load-bearing premise

The assumption that within-family homolog profiles and cross-family inverse-folding constraints are complementary and unbiased enough to improve robustness without post-hoc selection effects.

What would settle it

Performance on the ProteinGym benchmark dropping sharply when either the within-family or cross-family profile component is removed, measured across held-out assays spanning different taxa, MSA depths, and mutation depths.
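
The test described above comes down to comparing per-assay Spearman correlations with and without a profile component. A minimal sketch of that comparison follows; the measured fitness values and both sets of model scores are made up, and `full_model` and `ablated` are hypothetical names.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (vx * vy) ** 0.5


# Made-up data for one assay: measured fitness and two sets of predicted scores.
measured = [0.1, 0.4, 0.35, 0.8]
full_model = [-2.0, -0.5, -0.7, 0.3]
ablated = [-0.5, -2.0, -0.7, 0.3]

rho_full = spearman(measured, full_model)
rho_ablated = spearman(measured, ablated)
```

In this toy case the full scores rank the mutants exactly as the assay does, while the ablated scores swap two of them, so the correlation falls. A sharp and consistent drop of this kind across held-out assays is what the section describes as settling the question.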

Figures

Figures reproduced from arXiv: 2510.07286 by Chenchen Jing, Chunhua Shen, Hao Chen, Jigang Fan, Shengdong Lin, Weian Mao, Xiaoran Jiao, Zhanming Liang.

Figure 1: Overview of the proposed EvoIF.
Figure 2: Accuracy (Spearman) versus (a) model parameters and (b) training data scale.
Figure 3: Breakdown analysis on ProteinGym, across (a) function type, (b) MSA depth, (c) taxon, and (d) mutation depth. Ablation study on (e) homology quantity and (f) training data size. (g) Overall performance on all assays and out-of-distribution assays.
Figure 4: Visualization of fitness prediction results for the Spike glycoprotein.
Figure 5: Out-of-distribution evaluation on 23 ProteinGym assays with low similarity to …
Figure 6: Per-assay Spearman correlation for activity assays on ProteinGym.
Figure 7: Per-assay Spearman correlation for organismal fitness assays on ProteinGym.
Figure 8: Per-assay Spearman correlation for stability assays on ProteinGym.
Figure 9: Per-assay Spearman correlation for expression assays on ProteinGym.
Figure 10: Per-assay Spearman correlation for binding assays on ProteinGym.
Original abstract

Predicting the fitness impact of mutations is central to protein engineering but constrained by limited assays relative to the size of sequence space. Protein language models (pLMs) trained with masked language modeling (MLM) exhibit strong zero-shot fitness prediction; we provide a unifying view by interpreting natural evolution as implicit reward maximization and MLM as inverse reinforcement learning (IRL), in which extant sequences act as expert demonstrations and pLM log-odds serve as fitness estimates. Building on this perspective, we introduce EvoIF, a lightweight model that integrates two complementary sources of evolutionary signal: (i) within-family profiles from retrieved homologs and (ii) cross-family structural-evolutionary constraints distilled from inverse folding logits. EvoIF fuses sequence-structure representations with these profiles via a compact transition block, yielding calibrated probabilities for log-odds scoring. On ProteinGym (217 mutational assays; >2.5M mutants), EvoIF and its MSA-enabled variant achieve state-of-the-art or competitive performance while using only 0.15% of the training data and fewer parameters than recent large models. Ablations confirm that within-family and cross-family profiles are complementary, improving robustness across function types, MSA depths, taxa, and mutation depths. The codes will be made publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript proposes EvoIF, a lightweight model for protein fitness prediction that fuses within-family evolutionary profiles retrieved from homologs with cross-family structural constraints distilled from inverse-folding logits. It frames pLMs trained via masked language modeling as performing inverse reinforcement learning, with extant sequences as expert demonstrations and log-odds as fitness estimates. On the ProteinGym benchmark (217 mutational assays, >2.5M mutants), both the base EvoIF and its MSA-enabled variant report state-of-the-art or competitive performance while using only 0.15% of typical training data and fewer parameters than recent large models. Ablations are presented to demonstrate complementarity of the two profile sources and robustness across function types, MSA depths, taxa, and mutation depths.

Significance. If the performance and complementarity claims hold after clarification of experimental controls, the work would be significant for data-efficient protein engineering, offering a practical alternative to large-scale pLMs. The low-data and low-parameter regime is a clear strength, as is the promise of public code release. The IRL interpretive lens is novel but remains largely post-hoc; it does not appear to derive the benchmark numbers from first principles.

major comments (1)
  1. [Ablations] Ablations section: the claim that within-family and cross-family profiles are complementary and improve robustness rests on the reported fusion results. The manuscript must explicitly state whether the transition block architecture, fusion weights, or any hyperparameters were tuned or selected after inspecting ProteinGym outcomes. If any optimization occurred on the 217 assays used for final reporting, the complementarity conclusion risks being circular and the low-data advantage harder to attribute solely to the evolutionary signals.
minor comments (3)
  1. [Abstract] Abstract: the statement that codes 'will be made publicly available' should be replaced with a concrete repository URL or DOI at revision.
  2. [Methods] Methods: provide the precise mathematical definition of the transition block and the fusion operation (e.g., how logits and profiles are combined into calibrated probabilities).
  3. [Results] Results: include error bars or statistical significance tests for the benchmark comparisons to support the 'state-of-the-art or competitive' claim.
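
One way the error bars requested in minor comment 3 could be produced is a percentile bootstrap over assays on the paired per-assay difference between two models. The sketch below is illustrative only: the Spearman values are invented, `model_a` and `model_b` are hypothetical names, and the paper may use a different procedure.

```python
import random
from typing import List, Tuple


def bootstrap_mean_ci(
    values: List[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Percentile bootstrap: return (mean, lower, upper) for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(values) / n, lower, upper


# Made-up per-assay Spearman values for two hypothetical models on the same six assays.
model_a = [0.52, 0.41, 0.63, 0.30, 0.58, 0.47]
model_b = [0.50, 0.40, 0.60, 0.30, 0.54, 0.45]
diffs = [a - b for a, b in zip(model_a, model_b)]  # paired, per-assay differences
mean_diff, ci_low, ci_high = bootstrap_mean_ci(diffs)
```

An interval on the paired difference that excludes zero would support a claim of improvement; an interval that straddles zero is closer to what "competitive" means.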

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and insightful comments. We address the major concern regarding potential circularity in the ablation studies below and commit to revisions that improve transparency around experimental controls.

Point-by-point responses
  1. Referee: [Ablations] Ablations section: the claim that within-family and cross-family profiles are complementary and improve robustness rests on the reported fusion results. The manuscript must explicitly state whether the transition block architecture, fusion weights, or any hyperparameters were tuned or selected after inspecting ProteinGym outcomes. If any optimization occurred on the 217 assays used for final reporting, the complementarity conclusion risks being circular and the low-data advantage harder to attribute solely to the evolutionary signals.

    Authors: We appreciate the referee's emphasis on rigorous experimental controls. Upon internal review, the transition block architecture, fusion weights, and all other hyperparameters were fixed prior to the final ProteinGym evaluation. These choices were guided by a small, disjoint validation subset of mutational assays (distinct from the 217 reported) together with architectural precedents from related evolutionary profile literature. No optimization or selection occurred on the full set of 217 assays used for benchmarking. In the revised manuscript we will add an explicit paragraph in the Methods section and a dedicated note in the Ablations section documenting this procedure, the validation split used, and confirmation that all design decisions were frozen before final reporting. This clarification will eliminate any ambiguity about circularity and more clearly attribute performance gains to the complementarity of the two evolutionary signals.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained against external benchmark

Full rationale

The paper frames natural evolution as implicit reward maximization and MLM as IRL purely as an interpretive unifying view, not as a derivation whose equations reduce the reported log-odds scores or benchmark numbers to fitted inputs by construction. The central results are evaluated on the external ProteinGym dataset (217 assays, >2.5M mutants), with the low-data and parameter-efficiency claims tied directly to that independent test set rather than to any self-defined or self-cited quantity. No equations, ablations, or fusion steps are shown to collapse into the inputs via self-definition, post-hoc fitting renamed as prediction, or load-bearing self-citation chains. This is the normal, non-circular outcome for a paper whose performance claims rest on an external, held-out benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that natural sequences encode fitness information recoverable by both sequence profiles and inverse-folding logits, and that these two sources are additive rather than redundant.

axioms (2)
  • domain assumption: Natural evolution can be interpreted as implicit reward maximization.
    Invoked to frame MLM training as inverse reinforcement learning with extant sequences as expert demonstrations.
  • domain assumption: Retrieved homologs and inverse-folding logits supply complementary evolutionary signals.
    Underlies the claim that fusing the two profiles improves robustness across assay types and taxa.
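
One hedged way to make the first axiom concrete, offered as an illustration rather than the paper's own derivation, is to assume a maximum-entropy form in which extant sequences occur with probability proportional to the exponentiated reward. The symbols r, Z, the wild-type and mutant sequences, and the position index i are notation introduced here.

```latex
% Assumption (not from the paper): maximum-entropy link between reward and sequence probability.
p(x) = \frac{\exp\big(r(x)\big)}{Z}, \qquad Z = \sum_{x'} \exp\big(r(x')\big)

% The partition function cancels in a ratio, so a reward difference is a log-odds:
r(x^{\mathrm{mut}}) - r(x^{\mathrm{wt}}) = \log \frac{p(x^{\mathrm{mut}})}{p(x^{\mathrm{wt}})}

% For a single substitution at position i the two sequences share the context x_{-i},
% and p(x) = p(x_i \mid x_{-i})\, p(x_{-i}), so the context term cancels as well:
\log \frac{p(x^{\mathrm{mut}})}{p(x^{\mathrm{wt}})}
  = \log p\big(x_i^{\mathrm{mut}} \mid x_{-i}\big) - \log p\big(x_i^{\mathrm{wt}} \mid x_{-i}\big)
```

The right-hand side is the quantity a masked language model estimates when position i is masked, which is why log-odds can be read as a fitness estimate under this assumption. For multi-site mutants the contexts differ, so summing per-site terms is an approximation rather than an identity.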

pith-pipeline@v0.9.0 · 5779 in / 1383 out tokens · 39059 ms · 2026-05-18T08:53:59.343168+00:00 · methodology

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · 4 internal anchors

  1. [1]

    Correlated mutations and residue contacts in proteins. Proteins: Structure, Function, and Bioinformatics, 18(4):309–317, 1994

    Ulrike Göbel, Chris Sander, Reinhard Schneider, and Alfonso Valencia. Correlated mutations and residue contacts in proteins. Proteins: Structure, Function, and Bioinformatics, 18(4):309–317, 1994. doi: https://doi.org/10.1002/prot.340180402. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.340180402

  2. [2]

    Exploring protein fitness landscapes by directed evolution. Nature reviews Molecular cell biology, 10(12):866–876, 2009

    Philip A Romero and Frances H Arnold. Exploring protein fitness landscapes by directed evolution. Nature reviews Molecular cell biology, 10(12):866–876, 2009

  3. [3]

    Machine learning for functional protein design. Nature biotechnology, 42(2):216–228, 2024

    Pascal Notin, Nathan Rollins, Yarin Gal, Chris Sander, and Debora Marks. Machine learning for functional protein design. Nature biotechnology, 42(2):216–228, 2024

  4. [4]

    Low-n protein engineering with data-efficient deep learning. Nature methods, 18(4):389–396, 2021

    Surojit Biswas, Grigory Khimulya, Ethan C Alley, Kevin M Esvelt, and George M Church. Low-n protein engineering with data-efficient deep learning. Nature methods, 18(4):389–396, 2021

  5. [5]

    Mutation effects predicted from sequence co-variation. Nature biotechnology, 35(2):128–135, 2017

    Thomas A Hopf, John B Ingraham, Frank J Poelwijk, Charlotta PI Schärfe, Michael Springer, Chris Sander, and Debora S Marks. Mutation effects predicted from sequence co-variation. Nature biotechnology, 35(2):128–135, 2017

  6. [6]

    Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in neural information processing systems, 34:29287–29303, 2021

    Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu, and Alex Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. Advances in neural information processing systems, 34:29287–29303, 2021

  7. [7]

    Multi-scale representation learning for protein fitness prediction

    Zuobai Zhang, Pascal Notin, Yining Huang, Aurelie Lozano, Vijil Chenthamarakshan, Debora Marks, Payel Das, and Jian Tang. Multi-scale representation learning for protein fitness prediction. In Advances in Neural Information Processing Systems, 2024

  8. [8]

    Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

    Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. PNAS, 2019. doi: 10.1101/622803. URL https://www.biorxiv.org/content/10.1101/622803v4

  9. [9]

    Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv, 2022

    Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, and Alexander Rives. Language models of protein sequences at the scale of evolution enable accurate structure prediction. bioRxiv, 2022. doi: 10.1101/2022.07.20.500902. URL https://www.biorxiv.org/content/early/2022...

  10. [10]

    Learning inverse folding from millions of predicted structures

    Chloe Hsu, Robert Verkuil, Jason Liu, Zeming Lin, Brian Hie, Tom Sercu, Adam Lerer, and Alexander Rives. Learning inverse folding from millions of predicted structures. In International conference on machine learning, pages 8946–8970. PMLR, 2022

  11. [11]

    Proteingym: Large-scale benchmarks for protein fitness prediction and design. Advances in Neural Information Processing Systems, 36:64331–64379, 2023

    Pascal Notin, Aaron Kollasch, Daniel Ritter, Lood Van Niekerk, Steffanie Paul, Han Spinner, Nathan Rollins, Ada Shaw, Rose Orenbuch, Ruben Weitzman, et al. Proteingym: Large-scale benchmarks for protein fitness prediction and design. Advances in Neural Information Processing Systems, 36:64331–64379, 2023

  12. [12]

    Retrieval augmented protein language models for protein structure prediction

    Pan Li, Xingyi Cheng, Le Song, and Eric Xing. Retrieval augmented protein language models for protein structure prediction. 2024. doi: 10.1101/2024.12.02.626519. URL https://www.biorxiv.org/content/10.1101/2024.12.02.626519v1

  13. [13]

    Retrieval-enhanced mutation mastery: Augmenting zero-shot prediction of protein language model

    Yang Tan, Ruilin Wang, Banghao Wu, Liang Hong, and Bingxin Zhou. Retrieval-enhanced mutation mastery: Augmenting zero-shot prediction of protein language model. arXiv preprint arXiv:2410.21127, 2024. URL https://arxiv.org/abs/2410.21127

  14. [14]

    Multiple sequence alignment. Current Opinion in Structural Biology, 16(3):368–373, 2006

    Robert C Edgar and Serafim Batzoglou. Multiple sequence alignment. Current Opinion in Structural Biology, 16(3):368–373, 2006. ISSN 0959-440X. doi: https://doi.org/10.1016/j.sbi.2006.04.004. URL https://www.sciencedirect.com/science/article/pii/S0959440X06000704. Nucleic acids/Sequences and topology

  15. [15]

    Esm-if1: Structure-informed protein language model for inverse folding

    Faez Hsiao, Tarek Tadesse, Hayley Ho, Christopher Davis, Dan Jurafsky, and Jure Leskovec. Esm-if1: Structure-informed protein language model for inverse folding. bioRxiv, 2023. doi: 10.1101/2023.05.23.542000. URL https://www.biorxiv.org/content/10.1101/2023.05.23.542000v1

  16. [16]

    Algorithms for inverse reinforcement learning

    Andrew Y Ng, Stuart Russell, et al. Algorithms for inverse reinforcement learning. In Icml, volume 1, page 2, 2000

  17. [17]

    Maximum entropy inverse reinforcement learning

    Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, and Anind K. Dey. Maximum entropy inverse reinforcement learning. In Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3, AAAI’08, page 1433–1438. AAAI Press, 2008. ISBN 9781577353683

  18. [18]

    Fast and accurate protein structure search with foldseek. Nature biotechnology, 42(2):243–246, 2024

    Michel Van Kempen, Stephanie S Kim, Charlotte Tumescheit, Milot Mirdita, Jeongjae Lee, Cameron LM Gilchrist, Johannes Söding, and Martin Steinegger. Fast and accurate protein structure search with foldseek. Nature biotechnology, 42(2):243–246, 2024

  19. [19]

    Unsupervised evolution of protein and antibody complexes with a structure-informed language model

    Varun R. Shanker, Theodora U. J. Bruun, Brian L. Hie, and Peter S. Kim. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science, 385(6704):46–53, 2024. doi: 10.1126/science.adk8946. URL https://www.science.org/doi/abs/10.1126/science.adk8946

  20. [20]

    Advancing protein evolution with inverse folding models integrating structural and evolutionary constraints. Cell, 188(17):4674–4692.e19, 2025

    Hongyuan Fei, Yunjia Li, Yijing Liu, Jingjing Wei, Aojie Chen, and Caixia Gao. Advancing protein evolution with inverse folding models integrating structural and evolutionary constraints. Cell, 188(17):4674–4692.e19, 2025. ISSN 0092-8674. doi: https://doi.org/10.1016/j.cell.2025.06.014. URL https://www.sciencedirect.com/science/article/pii/S0092867425006804

  21. [21]

    Deep mutational scanning: a new style of protein science. Nature Methods, 2014

    Douglas M Fowler and Stanley Fields. Deep mutational scanning: a new style of protein science. Nature Methods, 2014. doi: 10.1038/nmeth.3027. URL https://doi.org/10.1038/nmeth.3027

  22. [22]

    Semantical and geometrical protein encoding toward enhanced bioactivity and thermostability. Elife, 13:RP98033, 2025

    Yang Tan, Bingxin Zhou, Lirong Zheng, Guisheng Fan, and Liang Hong. Semantical and geometrical protein encoding toward enhanced bioactivity and thermostability. Elife, 13:RP98033, 2025

  23. [23]

    Saprot: Protein language modeling with structure-aware vocabulary. BioRxiv, pages 2023–10, 2023

    Jin Su, Chenchen Han, Yuyang Zhou, Junjie Shan, Xibin Zhou, and Fajie Yuan. Saprot: Protein language modeling with structure-aware vocabulary. BioRxiv, pages 2023–10, 2023

  24. [24]

    ProSST: Protein language modeling with quantized structure and disentangled attention

    Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Huiqun Yu, Ziyi Zhou, Wanli Ouyang, Bingxin Zhou, Pan Tan, and Liang Hong. ProSST: Protein language modeling with quantized structure and disentangled attention. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.

  25. [25]

    Ning Sun, Shuxian Zou, Tianhua Tao, Sazan Mahbub, Dian Li, Yonghao Zhuang, Hongyi Wang, Xingyi Cheng, Le Song, and Eric P. Xing. Mixture of experts enable efficient and effective protein understanding and design. In NeurIPS 2024 Workshop on AI for New Drug Modalities. bioRxiv, 2024. doi: 10.1101/2024.11.29.625425. URL https://www.biorxiv.org/content/10.110...

  26. [26]

    Diffusion language models are versatile protein learners

    Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, and Quanquan Gu. Diffusion language models are versatile protein learners. In International Conference on Machine Learning, 2024

  27. [27]

    Language models enable zero-shot prediction of the effects of mutations on protein function

    Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu, and Alex Rives. Language models enable zero-shot prediction of the effects of mutations on protein function. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 29287–29303. Curran Associate...

  28. [28]

    Epistasis in protein evolution. Protein science, 25(7):1204–1218, 2016

    Tyler N Starr and Joseph W Thornton. Epistasis in protein evolution. Protein science, 25(7):1204–1218, 2016

  29. [29]

    Deep Think with Confidence

    Yichao Fu, Xuewei Wang, Yuandong Tian, and Jiawei Zhao. Deep think with confidence. arXiv preprint arXiv:2508.15260, 2025. URL https://arxiv.org/abs/2508.15260

  30. [30]

    Deep researcher with test-time diffusion, 2025

    Rujun Han, Yanfei Chen, Zoey CuiZhu, Lesly Miculicich, Guan Sun, Yuanjun Bi, Weiming Wen, Hui Wan, Chunfeng Wen, Solène Maître, George Lee, Vishy Tirumalashetty, Emily Xue, Zizhao Zhang, Salem Haykal, Burak Gokturk, Tomas Pfister, and Chen-Yu Lee. Deep researcher with test-time diffusion, 2025. URL https://arxiv.org/abs/2507.16075

  31. [31]

    Parathinker: Native parallel thinking as a new paradigm to scale llm test-time compute. arXiv preprint arXiv:2509.04475

    Hao Wen, Yifan Su, Feifei Zhang, Yunxin Liu, Yunhao Liu, Ya-Qin Zhang, and Yuanchun Li. Parathinker: Native parallel thinking as a new paradigm to scale llm test-time compute. arXiv preprint arXiv:2509.04475, 2025

  32. [32]

    The majority is not always right: Rl training for solution aggregation. arXiv preprint arXiv:2509.06870, 2025

    Wenting Zhao, Pranjal Aggarwal, Swarnadeep Saha, Asli Celikyilmaz, Jason Weston, and Ilia Kulikov. The majority is not always right: Rl training for solution aggregation. arXiv preprint arXiv:2509.06870, 2025

  33. [33]

    Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael J. L. Townshend, and Ron Dror. Learning from protein structure with geometric vector perceptrons, 2021. URL https://arxiv.org/abs/2009.01411

  34. [34]

    Steering protein family design through profile bayesian flow

    Jingjing Gong, Yu Pei, Siyu Long, Yuxuan Song, Zhe Zhang, Wenhao Huang, Ziyao Cao, Shuyi Zhang, Hao Zhou, and Wei-Ying Ma. Steering protein family design through profile bayesian flow. In The Thirteenth International Conference on Learning Representations. URL https://openreview.net/forum?id=PSiijdQjNU

  36. [36]

    Boltz-2: Towards accurate and efficient binding affinity prediction. bioRxiv, 2025

    Saro Passaro, Gabriele Corso, Jeremy Wohlwend, Mateo Reveiz, Stephan Thaler, Vignesh Ram Somnath, Noah Getz, Tally Portnoi, Julien Roy, Hannes Stark, David Kwabi-Addo, Dominique Beaini, Tommi Jaakkola, and Regina Barzilay. Boltz-2: Towards accurate and efficient binding affinity prediction. bioRxiv, 2025. doi: 10.1101/2025.06.14.659707

  37. [37]

    Amix-1: A pathway to test-time scalable protein foundation model. arXiv preprint arXiv:2507.08920, 2025

    Changze Lv, Jiang Zhou, Siyu Long, Lihao Wang, Jiangtao Feng, Dongyu Xue, Yu Pei, Hao Wang, Zherui Zhang, Yuchen Cai, Zhiqiang Gao, Ziyuan Ma, Jiakai Hu, Chaochen Gao, Jingjing Gong, Yuxuan Song, Shuyi Zhang, Xiaoqing Zheng, Deyi Xiong, Lei Bai, Wanli Ouyang, Ya-Qin Zhang, Wei-Ying Ma, Bowen Zhou, and Hao Zhou. Amix-1: A pathway to test-time scalable pro...

  38. [38]

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding, 2019. URL https://arxiv.org/abs/1810.04805.

  39. [39]

    Cath: increased structural coverage of functional space. Nucleic acids research, 49(D1):D266–D273, 2021

    Ian Sillitoe, Nicola Bordin, Natalie Dawson, Vaishali P Waman, Paul Ashford, Harry M Scholes, Camilla SM Pang, Laurel Woodridge, Clemens Rauer, Neeladri Sen, et al. Cath: increased structural coverage of functional space. Nucleic acids research, 49(D1):D266–D273, 2021

  40. [40]

    Gemme: a simple and fast global epistatic model predicting mutational effects. Molecular biology and evolution, 36(11):2604–2619, 2019

    Elodie Laine, Yasaman Karami, and Alessandra Carbone. Gemme: a simple and fast global epistatic model predicting mutational effects. Molecular biology and evolution, 36(11):2604–2619, 2019

  41. [41]

    Progen2: exploring the boundaries of protein language models. Cell systems, 14(11):968–978, 2023

    Erik Nijkamp, Jeffrey A Ruffolo, Eli N Weinstein, Nikhil Naik, and Ali Madani. Progen2: exploring the boundaries of protein language models. Cell systems, 14(11):968–978, 2023

  42. [42]

    Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems, 15(3):286–294, 2024

    Kevin K Yang, Nicolo Fusi, and Alex X Lu. Convolutions are competitive with transformers for protein sequence pretraining. Cell Systems, 15(3):286–294, 2024

  43. [43]

    Disease variant prediction with deep generative models of evolutionary data. Nature, 599(7883):91–95, 2021

    Jonathan Frazer, Pascal Notin, Mafalda Dias, Aidan Gomez, Joseph K Min, Kelly Brock, Yarin Gal, and Debora S Marks. Disease variant prediction with deep generative models of evolutionary data. Nature, 599(7883):91–95, 2021

  44. [44]

    Msa transformer

    Roshan M Rao, Jason Liu, Robert Verkuil, Joshua Meier, John Canny, Pieter Abbeel, Tom Sercu, and Alexander Rives. Msa transformer. In International conference on machine learning, pages 8844–8856. PMLR, 2021

  45. [45]

    Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

    Pascal Notin, Mafalda Dias, Jonathan Frazer, Javier Marchena-Hurtado, Aidan N Gomez, Debora Marks, and Yarin Gal. Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval. In International Conference on Machine Learning, pages 16990–17017. PMLR, 2022

  46. [46]

    Trancepteve: Combining family-specific and family-agnostic models of protein sequences for improved fitness prediction. bioRxiv, pages 2022–12, 2022

    Pascal Notin, Lood Van Niekerk, Aaron W Kollasch, Daniel Ritter, Yarin Gal, and Debora S Marks. Trancepteve: Combining family-specific and family-agnostic models of protein sequences for improved fitness prediction. bioRxiv, pages 2022–12, 2022

  47. [47]

    Robust deep learning–based protein sequence design using proteinmpnn. Science, 378(6615):49–56, 2022

    Justas Dauparas, Ivan Anishchenko, Nathaniel Bennett, Hua Bai, Robert J Ragotte, Lukas F Milles, Basile IM Wicky, Alexis Courbet, Rob J de Haas, Neville Bethel, et al. Robust deep learning–based protein sequence design using proteinmpnn. Science, 378(6615):49–56, 2022

  48. [48]

    Masked inverse folding with sequence transfer for protein representation learning. Protein Engineering, Design and Selection, 36:gzad015, 2023

    Kevin K Yang, Niccolò Zanichelli, and Hugh Yeh. Masked inverse folding with sequence transfer for protein representation learning. Protein Engineering, Design and Selection, 36:gzad015, 2023

  49. [49]

    Deep generative models of genetic variation capture the effects of mutations. Nature methods, 15(10):816–822, 2018

    Adam J Riesselman, John B Ingraham, and Debora S Marks. Deep generative models of genetic variation capture the effects of mutations. Nature methods, 15(10):816–822, 2018

  50. [50]

    Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, 2023

    Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Nikita Smetanin, Robert Verkuil, Ori Kabeli, Yaniv Shmueli, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, 2023

  51. [51]

    Muon is Scalable for LLM Training

    Jingyuan Liu, Jianlin Su, Xingcheng Yao, Zhejun Jiang, Guokun Lai, Yulun Du, Yidao Qin, Weixin Xu, Enzhe Lu, Junjie Yan, Yanru Chen, Huabin Zheng, Yibo Liu, Shaowei Liu, Bohong Yin, Weiran He, Han Zhu, Yuzhi Wang, Jianzhou Wang, Mengnan Dong, Zheng Zhang, Yongsheng Kang, Hao Zhang, Xinran Xu, Yutao Zhang, Yuxin Wu, Xinyu Zhou, and Zhilin Yang. Muon is sca...

  52. [52]

    Decoupled Weight Decay Regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.

  53. [53]

    Alphafold protein structure database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Research, 52(D1):D368–D375, 2024

    Mihaly Varadi, Damian Bertoni, Paulyna Magana, Urmila Paramval, Ivanna Pidruchna, Malarvizhi Radhakrishnan, Maxim Tsenkov, Sreenath Nair, Milot Mirdita, Jingi Yeo, Oleg Kovalevskiy, Kathryn Tunyasuvunakool, Agata Laydon, Augustin Žídek, Hamish Tomlinson, Dhavanthi Hariharan, Josh Abrahamson, Tim Green, John Jumper, Ewan Birney, Martin Steinegger, Demis Ha...

  54. [54]

    for other parameters. Matrix parameters (defined as parameters with dimensionality ≥2D) are optimized using Muon with a learning rate of 1 × 10⁻³, momentum of 0.95, 5 Newton-Schulz steps, and weight decay of 0.1. The remaining parameters use AdamW with β1 = 0.9, β2 = 0.95, ϵ = 1 × 10⁻⁸, and weight decay of 0.1. Parameters are automatically routed based on di...
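
The routing rule stated in this fragment (parameters with two or more dimensions go to Muon, the rest to AdamW) can be sketched without any deep-learning library by working on parameter shapes alone. The function name, the dictionary keys, and the example parameter names are hypothetical, and the AdamW learning rate is left out because it is not visible in the excerpt.

```python
from typing import Dict, List, Tuple

# Hyperparameters quoted in the passage above; the dictionary keys are illustrative.
MUON_CONFIG = {"lr": 1e-3, "momentum": 0.95, "newton_schulz_steps": 5, "weight_decay": 0.1}
ADAMW_CONFIG = {"betas": (0.9, 0.95), "eps": 1e-8, "weight_decay": 0.1}  # lr not given in the excerpt


def route_parameters(shapes: Dict[str, Tuple[int, ...]]) -> Dict[str, List[str]]:
    """Split parameter names into optimizer groups by number of dimensions."""
    groups: Dict[str, List[str]] = {"muon": [], "adamw": []}
    for name, shape in shapes.items():
        # Matrix-like parameters (2 or more dimensions) go to Muon, everything else to AdamW.
        groups["muon" if len(shape) >= 2 else "adamw"].append(name)
    return groups


# Hypothetical parameter names and shapes for a small block.
example_shapes = {
    "transition.linear.weight": (256, 256),
    "transition.linear.bias": (256,),
    "norm.scale": (256,),
}
groups = route_parameters(example_shapes)
```

In a real training script the two name lists would be used to build the parameter groups handed to each optimizer, with `MUON_CONFIG` and `ADAMW_CONFIG` supplying the settings.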