Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery
Pith reviewed 2026-05-22 02:05 UTC · model grok-4.3
The pith
Protein Thoughts achieves a mean best-binder rank of 11.2 on SHS148k by keeping four biological signals separate in a transparent value function and using that function to guide Tree-of-Thoughts search.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that reformulating PPI discovery as an interpretable search problem, with binding evidence decomposed into four biologically meaningful signals kept separate in a transparent value function, and navigated by hypothesis-guided entropy-regularized Tree-of-Thoughts search plus embedding-space flow matching for score disagreements, produces both stronger ranking performance and auditable predictions on the SHS148k benchmark.
What carries the argument
The transparent value function that preserves separate contributions from sequence similarity, structural complementarity, interface balance, and chemical compatibility while guiding an entropy-regularized Tree-of-Thoughts policy.
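A minimal sketch of what such a value function could look like, assuming a weighted sum over four signals that have each been scaled to [0, 1]. The weights, the `SignalScores` type, and the function name `value_with_trace` are hypothetical and chosen for illustration; the paper's actual coefficients and signal definitions are not given here.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass

# Hypothetical weights, for illustration only (not the paper's coefficients).
WEIGHTS = {
    "sequence_similarity": 0.25,
    "structural_complementarity": 0.30,
    "interface_balance": 0.20,
    "chemical_compatibility": 0.25,
}


@dataclass(frozen=True)
class SignalScores:
    """Per-candidate signals, each assumed to be scaled to [0, 1]."""

    sequence_similarity: float
    structural_complementarity: float
    interface_balance: float
    chemical_compatibility: float


def value_with_trace(scores: SignalScores) -> tuple[float, dict[str, float]]:
    """Return the total value together with each signal's contribution."""
    contributions = {
        name: WEIGHTS[name] * score for name, score in asdict(scores).items()
    }
    return sum(contributions.values()), contributions


candidate = SignalScores(0.8, 0.6, 0.5, 0.9)
total, trace = value_with_trace(candidate)
print(f"value = {total:.3f}")
# Largest contribution first, so the dominant signal is easy to audit.
for name, part in sorted(trace.items(), key=lambda kv: -kv[1]):
    print(f"  {name}: {part:.3f}")
```

Returning the per-signal contributions alongside the scalar is what makes the score auditable: the ranking uses the total, while a reader can inspect which term drove it.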
If this is right
- True binders appear at mean rank 11.2 instead of 47.7, a 76 percent improvement over entropic tree search.
- Binding prediction reaches 91.08 ± 0.19 Micro-F1, outperforming prior PPI methods on the same dataset.
- Each prediction carries an explicit trace of which biological signals contributed, allowing direct inspection.
- Large candidate spaces are traversed efficiently by classifying proteins as high-priority, exploratory, or skippable and by pruning low-value branches.
- Embedding flow matching resolves cases where signals disagree by transporting embeddings toward the binder manifold.
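The transport step in the last point can be illustrated with a minimal flow-matching sketch. It assumes the straight-line probability path commonly used in flow matching, where a point moves from a source embedding `x0` to a target embedding `x1` with constant velocity `x1 - x0`. The toy three-dimensional vectors are made up, and the `oracle` velocity function is a stand-in for a trained, hypothesis-conditioned velocity network, which the source does not specify.

```python
from __future__ import annotations

import random


def interpolate(x0: list[float], x1: list[float], t: float) -> list[float]:
    """Point on the straight-line path x_t = (1 - t) * x0 + t * x1."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: list[float], x1: list[float]) -> list[float]:
    """Regression target for the straight-line path: u_t = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def cfm_training_example(x0, x1, rng: random.Random):
    """One (x_t, t, u_t) example a velocity network would be fitted on."""
    t = rng.random()
    return interpolate(x0, x1, t), t, target_velocity(x0, x1)


def transport(x0: list[float], velocity_fn, steps: int = 10) -> list[float]:
    """Euler-integrate dx/dt = v(x, t) from t = 0 to t = 1."""
    x, dt = list(x0), 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy embeddings (made up): an ambiguous candidate and a known binder.
ambiguous = [0.0, 1.0, -1.0]
binder = [1.0, 0.5, 0.0]


def oracle(x, t):
    """Stand-in for a trained network: the exact straight-line velocity."""
    return target_velocity(ambiguous, binder)


moved = transport(ambiguous, oracle)
print([round(m, 6) for m in moved])
```

With a learned velocity field the endpoint would only approximate the binder region; the oracle is used here so the mechanics of the integration are easy to verify.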
Where Pith is reading between the lines
- The explicit reasoning traces could be used to generate targeted experimental hypotheses about which residues or structural features drive a particular interaction.
- The same decomposition might be applied to related tasks such as predicting protein-small molecule or protein-DNA interactions if the signals generalize.
- Combining the value function with new structural models could strengthen the geometric complementarity signal without retraining the entire system.
- The framework invites tests on datasets containing many known non-binders to check whether the signals avoid learning spurious patterns.
Load-bearing premise
The four signals can be preserved in a transparent value function that reflects genuine biochemical insight rather than spurious correlations learned from the benchmark.
What would settle it
The claim would be undermined if the top-ranked predictions fail to validate in independent wet-lab experiments at rates higher than those of baseline methods, or if the individual signal contributions do not align with established biochemical mechanisms for well-studied protein pairs.
Original abstract
Protein-protein interactions (PPIs) govern nearly all cellular processes, yet computational methods for identifying binding partners typically produce ranked predictions without mechanistic justification. This creates a fundamental barrier to adoption because biologists cannot assess whether predictions reflect genuine biochemical insight or spurious correlations. We present Protein Thoughts, a framework that reformulates PPI discovery as an interpretable search problem with explicit reasoning. The system decomposes binding evidence into four biologically meaningful signals: sequence similarity reflecting evolutionary relationships, structural complementarity capturing geometric fit, interface balance, and chemical compatibility encoding residue-level interactions. Rather than collapsing these signals into an opaque score, we preserve their individual contributions through a transparent value function that enables both ranking and auditing. To navigate large candidate spaces efficiently, we introduce hypothesis-guided entropy-regularized Tree-of-Thoughts search. A fine-tuned language model generates search directives from embedding-derived features, classifying candidates as high-priority, exploratory, or skippable. These directives condition a Boltzmann policy that balances exploitation with entropy-driven exploration, while hypothesis-aware pruning prevents premature abandonment of promising candidates. For candidates exhibiting score disagreement, hypothesis-conditioned embedding-space flow matching transports protein embeddings toward the binder manifold. On the SHS148k benchmark, Protein Thoughts achieves mean best-binder rank of 11.2 versus 47.7 for an entropic tree search baseline, a 76% improvement, and for binding prediction the trained value function achieves 91.08 ± 0.19 Micro-F1, outperforming existing PPI methods on the same dataset.
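The directive-conditioned Boltzmann policy described in the abstract can be sketched as a softmax over candidate values with a temperature that controls exploration. This is a minimal illustration under stated assumptions: directives are modeled as additive biases on the value before the softmax, and the bias values, temperature settings, and function names are hypothetical rather than taken from the paper.

```python
import math
import random

# Hypothetical additive biases for the three directive classes.
DIRECTIVE_BIAS = {"high-priority": 1.0, "exploratory": 0.0, "skippable": -2.0}


def boltzmann_policy(values, directives, temperature=0.5):
    """Softmax over directive-adjusted values; higher temperature explores more."""
    logits = [
        (v + DIRECTIVE_BIAS[d]) / temperature for v, d in zip(values, directives)
    ]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


values = [0.70, 0.65, 0.40]
directives = ["exploratory", "high-priority", "skippable"]

for tau in (0.1, 0.5, 2.0):
    probs = boltzmann_policy(values, directives, tau)
    print(tau, [round(p, 3) for p in probs], round(entropy(probs), 3))

# Sample the next candidate to expand from the policy.
rng = random.Random(0)
choice = rng.choices(
    range(len(values)), weights=boltzmann_policy(values, directives)
)[0]
```

A Boltzmann distribution with temperature τ is the maximizer of expected value plus τ times the policy entropy, which is why raising τ spreads probability onto lower-valued candidates instead of always expanding the current best.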
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Protein Thoughts, a framework reformulating PPI discovery as interpretable search via hypothesis-guided entropy-regularized Tree-of-Thoughts combined with embedding-space flow matching. Binding evidence is decomposed into four signals (sequence similarity, structural complementarity, interface balance, chemical compatibility) preserved in a transparent value function for ranking and auditing. A fine-tuned LM generates search directives conditioning a Boltzmann policy, with hypothesis-aware pruning and flow matching for score-disagreement cases. On SHS148k, it reports mean best-binder rank of 11.2 (vs. 47.7 baseline, 76% improvement) and trained value function Micro-F1 of 91.08 ± 0.19, outperforming prior PPI methods.
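For readers who want the headline numbers unpacked, the sketch below shows one plausible reading of the rank metric and the arithmetic behind the reported improvement. The definition used here (the 1-based position of the highest-ranked true binder for each query, averaged over queries) is an assumption about what "mean best-binder rank" means; the toy scores and function names are illustrative only.

```python
def best_binder_rank(scores, binders):
    """1-based rank of the highest-ranked true binder for one query protein."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in binders:
            return rank
    raise ValueError("no true binder among the candidates")


def mean_best_binder_rank(queries):
    """Average best-binder rank over (scores, binders) pairs; lower is better."""
    ranks = [best_binder_rank(scores, binders) for scores, binders in queries]
    return sum(ranks) / len(ranks)


def relative_improvement(baseline, method):
    """Fractional reduction in rank relative to the baseline."""
    return (baseline - method) / baseline


# Toy queries (made-up scores): binder found at rank 2, then at rank 1.
toy_queries = [
    ({"A": 0.9, "B": 0.5, "C": 0.1}, {"B", "C"}),
    ({"A": 0.2, "B": 0.7, "C": 0.4}, {"B"}),
]
print(mean_best_binder_rank(toy_queries))

# Reported figures: 47.7 (entropic tree search) versus 11.2 (Protein Thoughts).
print(f"{relative_improvement(47.7, 11.2):.1%}")  # prints 76.5%
```

Computed from the rounded figures, the reduction is about 76.5%, in line with the roughly 76% stated in the paper.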
Significance. If the central claims hold after addressing methodological gaps, the work could meaningfully advance q-bio by enabling mechanistic auditing of PPI predictions rather than opaque scores, potentially increasing biologist adoption. The integration of ToT with flow matching for embedding transport and the explicit multi-signal value function represent a creative direction for interpretable search in large protein spaces. The reported rank improvement and F1 score, if reproducible and free of leakage, would constitute a substantial empirical advance over entropic baselines.
major comments (3)
- [Abstract] Abstract: The reported metrics (mean best-binder rank 11.2, Micro-F1 91.08 ± 0.19) are presented without any description of training procedure, data splits, cross-validation, or leakage checks for the SHS148k benchmark. This is load-bearing for the central claim because the value function is explicitly trained and the search policy uses a fine-tuned LM; without these details it is impossible to confirm that the 76% rank improvement reflects genuine mechanistic insight rather than benchmark fitting.
- [Value function / Results] Value function description (throughout Methods and Results): The manuscript claims the value function transparently preserves the four biological signals, yet no ablation isolating each signal's contribution or external validation against independent mechanistic data is provided. This directly undermines the interpretability contribution, as high benchmark scores could arise from spurious correlations (e.g., sequence composition biases) without the function reflecting genuine biochemical decomposition.
- [Search and Flow Matching] Hypothesis-guided search and flow matching sections: The Boltzmann policy, entropy regularization, and embedding-space flow matching are introduced to handle large candidate spaces and score disagreement, but no quantitative breakdown (e.g., ablation removing flow matching or varying entropy) shows how these components drive the reported rank improvement versus the entropic tree search baseline.
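The split and leakage concern raised in the first major comment can be made concrete with a small check. The sketch below assumes interactions are stored as unordered pairs of protein identifiers; the function names, the default split fractions, and the toy identifiers are illustrative assumptions and do not describe the paper's pipeline.

```python
import random


def canonical(pair):
    """Order-independent form of a pair, so (A, B) and (B, A) compare equal."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def split_pairs(pairs, fractions=(0.70, 0.15, 0.15), seed=0):
    """Deduplicate unordered pairs, shuffle, and cut into train/val/test."""
    unique = sorted({canonical(p) for p in pairs})
    random.Random(seed).shuffle(unique)
    n_train = int(fractions[0] * len(unique))
    n_val = int(fractions[1] * len(unique))
    return (
        unique[:n_train],
        unique[n_train:n_train + n_val],
        unique[n_train + n_val:],
    )


def pair_leakage(train, test):
    """Pairs that appear on both sides of the split, ignoring order."""
    return {canonical(p) for p in train} & {canonical(p) for p in test}


def shared_proteins(train, test):
    """Individual proteins that appear on both sides of the split."""
    def proteins(pairs):
        return {protein for pair in pairs for protein in pair}

    return proteins(train) & proteins(test)


# Toy data: all pairs over 8 made-up proteins, plus one reversed duplicate.
pairs = [(f"P{i}", f"P{j}") for i in range(8) for j in range(i + 1, 8)]
pairs.append(("P1", "P0"))
train, val, test = split_pairs(pairs)

print("pairs in both train and test:", len(pair_leakage(train, test)))
print("proteins in both train and test:", len(shared_proteins(train, test)))
```

Pair-level disjointness does not by itself rule out the same protein appearing on both sides of the split, so the sketch reports shared proteins separately; which of the two criteria a benchmark protocol enforces is worth stating explicitly.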
minor comments (2)
- [Abstract / Methods] The abstract and text would benefit from an explicit equation defining the transparent value function and how the four signals are combined, rather than relying on prose description.
- [Figures] Figure captions for any search-tree or embedding visualizations should include quantitative metrics (e.g., number of nodes expanded, pruning rates) to allow readers to assess efficiency claims.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important areas for improving clarity and empirical support. We address each major comment point by point below, indicating the revisions we will incorporate.
Point-by-point responses
-
Referee: [Abstract] Abstract: The reported metrics (mean best-binder rank 11.2, Micro-F1 91.08 ± 0.19) are presented without any description of training procedure, data splits, cross-validation, or leakage checks for the SHS148k benchmark. This is load-bearing for the central claim because the value function is explicitly trained and the search policy uses a fine-tuned LM; without these details it is impossible to confirm that the 76% rank improvement reflects genuine mechanistic insight rather than benchmark fitting.
Authors: We agree that the abstract should supply enough context for the key metrics. The Methods section specifies that the value function is trained on SHS148k using a 70/15/15 split with 5-fold cross-validation, and that a held-out set is used to fine-tune the language model that generates search directives. No data leakage occurs because candidate pairs in the test set are disjoint from those in the training set. We will revise the abstract to include a concise statement of the training and validation protocol.
Revision: yes
-
Referee: [Value function / Results] Value function description (throughout Methods and Results): The manuscript claims the value function transparently preserves the four biological signals, yet no ablation isolating each signal's contribution or external validation against independent mechanistic data is provided. This directly undermines the interpretability contribution, as high benchmark scores could arise from spurious correlations (e.g., sequence composition biases) without the function reflecting genuine biochemical decomposition.
Authors: The value function is explicitly constructed as a weighted sum of the four signals with coefficients chosen from biochemical literature, so each term remains inspectable. While the initial submission did not contain a systematic ablation, the transparent formulation already permits per-signal auditing in the reported case studies. To strengthen the claim, we will add an ablation table that removes one signal at a time and reports the resulting drop in Micro-F1 together with qualitative examples of how the remaining signals still align with known binding mechanisms.
Revision: yes
-
Referee: [Search and Flow Matching] Hypothesis-guided search and flow matching sections: The Boltzmann policy, entropy regularization, and embedding-space flow matching are introduced to handle large candidate spaces and score disagreement, but no quantitative breakdown (e.g., ablation removing flow matching or varying entropy) shows how these components drive the reported rank improvement versus the entropic tree search baseline.
Authors: The primary comparison is already against the entropic tree search baseline, which isolates the net contribution of the hypothesis-guided policy plus flow matching. We acknowledge that finer-grained component ablations would make the source of the 76% rank gain more transparent. In the revision we will add two targeted ablations: (i) the framework without embedding-space flow matching and (ii) the framework with entropy regularization disabled, each reporting the change in mean best-binder rank on SHS148k.
Revision: yes
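The leave-one-signal-out ablation promised in the responses above could be organized along the following lines. This is a sketch under stated assumptions: a weighted-sum score thresholded into a single "binds" label, weights renormalized after a signal is dropped, and four made-up samples. None of the weights, thresholds, or data come from the paper.

```python
SIGNALS = ("sequence", "structure", "interface", "chemistry")
WEIGHTS = dict(zip(SIGNALS, (0.25, 0.30, 0.20, 0.25)))  # hypothetical weights


def micro_f1(true_sets, pred_sets):
    """Micro-F1: pool TP, FP, and FN over all samples and labels."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)
        fp += len(pred - truth)
        fn += len(truth - pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def predict(sample, weights, threshold=0.5):
    """Threshold the weighted mean of the signals that remain in `weights`."""
    score = sum(w * sample[s] for s, w in weights.items()) / sum(weights.values())
    return {"binds"} if score >= threshold else set()


def ablation_table(samples, labels, weights):
    """Micro-F1 with all signals ('none' dropped) and with each one removed."""
    rows = {"none": micro_f1(labels, [predict(x, weights) for x in samples])}
    for dropped in weights:
        kept = {s: w for s, w in weights.items() if s != dropped}
        rows[dropped] = micro_f1(labels, [predict(x, kept) for x in samples])
    return rows


# Toy samples (made up): two binders and two non-binders.
samples = [
    {"sequence": 0.9, "structure": 0.8, "interface": 0.7, "chemistry": 0.9},
    {"sequence": 0.2, "structure": 0.3, "interface": 0.4, "chemistry": 0.1},
    {"sequence": 0.1, "structure": 0.9, "interface": 0.8, "chemistry": 0.7},
    {"sequence": 0.9, "structure": 0.2, "interface": 0.3, "chemistry": 0.2},
]
labels = [{"binds"}, set(), {"binds"}, set()]

for dropped, score in ablation_table(samples, labels, WEIGHTS).items():
    print(f"dropped={dropped:<10} micro-F1={score:.3f}")
```

Renormalizing the remaining weights keeps the decision threshold comparable across rows; an alternative design would retrain or refit the weights for each ablated variant, which answers a slightly different question.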
Circularity Check
No significant circularity; empirical results from trained model on standard benchmark do not reduce to inputs by construction
The paper presents an empirical ML framework combining Tree-of-Thoughts search, a trained value function, and flow matching, then reports benchmark numbers (rank 11.2, Micro-F1 91.08) on SHS148k. No derivation chain is claimed that reduces a first-principles prediction or uniqueness theorem to the training data or self-citations. The performance figures are standard held-out evaluations of a fitted system rather than a self-definitional or fitted-input-called-prediction reduction. The four-signal decomposition is presented as a modeling choice whose transparency is asserted, not proven by construction from the benchmark itself. No load-bearing self-citation or ansatz smuggling is evident in the provided text.
Axiom & Free-Parameter Ledger
free parameters (1)
- value function parameters
axioms (1)
- Domain assumption: the four signals (sequence similarity, structural complementarity, interface balance, chemical compatibility) reflect genuine biochemical contributions to binding.