Synergistic Benefits of Joint Molecule Generation and Property Prediction
Pith reviewed 2026-05-22 17:41 UTC · model grok-4.3
The pith
Hyformer jointly generates molecules and predicts their properties with synergistic performance gains.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Hyformer is a transformer-based joint model that successfully blends the generative and predictive functionalities, using an alternating attention mechanism and a joint pre-training scheme. It is simultaneously optimized for molecule generation and property prediction, while exhibiting synergistic benefits in conditional sampling, out-of-distribution property prediction and representation learning.
What carries the argument
An alternating attention mechanism combined with a joint pre-training scheme, which lets the model switch between generative and predictive modes while sharing learned representations.
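One way to picture the alternating attention is as a task-dependent attention mask over a shared backbone: causal when the model runs in its generative mode, bidirectional when it runs in a predictive mode. The sketch below is a minimal illustration under that assumption, using the task tokens [LM], [PRED] and [MLM] from the paper passage quoted later on this page. The function name `attention_mask` and the list-of-lists mask layout are hypothetical and not taken from the authors' code.

```python
# Illustrative sketch only: task-dependent attention masks.
# `attention_mask` is a hypothetical helper, not the authors' implementation.

CAUSAL_TASKS = {"[LM]"}                    # generative mode: left-to-right attention
BIDIRECTIONAL_TASKS = {"[PRED]", "[MLM]"}  # predictive modes: full attention


def attention_mask(task: str, seq_len: int) -> list[list[bool]]:
    """Return mask[i][j] == True when position i may attend to position j."""
    if task in CAUSAL_TASKS:
        return [[j <= i for j in range(seq_len)] for i in range(seq_len)]
    if task in BIDIRECTIONAL_TASKS:
        return [[True] * seq_len for _ in range(seq_len)]
    raise ValueError(f"unknown task token: {task!r}")


if __name__ == "__main__":
    for task in ("[LM]", "[PRED]"):
        print(task)
        for row in attention_mask(task, 4):
            print("".join("x" if allowed else "." for allowed in row))
```

Because only the mask changes between modes, the same weights serve both tasks, which is what "sharing learned representations" amounts to in this reading.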
If this is right
- Conditional generation of molecules with specified properties becomes more accurate because the model learns the joint distribution.
- Property prediction generalizes better to molecules outside the training distribution due to shared generative and predictive training.
- Representation learning improves for both tasks because each task regularizes the shared features.
- Drug design workflows can use one model to propose and score candidate antimicrobial peptides.
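The first two consequences rest on the standard factorization of a joint distribution. As a reminder of the reasoning, with x a molecule and y a property (symbols chosen here for illustration; this page does not fix a notation):

```latex
p(x, y) = p(x)\, p(y \mid x)
\qquad\Longrightarrow\qquad
p(x \mid y) = \frac{p(x)\, p(y \mid x)}{p(y)} \;\propto\; p(x)\, p(y \mid x)
```

Read this way, a generator supplies p(x), a predictor supplies p(y | x), and conditional sampling needs both factors, which is why a single model covering both is attractive. How Hyformer actually draws conditional samples is not stated on this page.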
Where Pith is reading between the lines
- The joint approach may extend to other paired data domains, such as protein sequences and their functions, where both generation and prediction are needed.
- Reducing the number of separate models could lower computational overhead in screening large chemical libraries.
- Future experiments could test whether adding more property labels during pre-training further amplifies the observed synergies.
Load-bearing premise
The architectural and optimization challenges of training a single joint model for both generation and prediction can be overcome by an alternating attention mechanism and joint pre-training scheme without introducing new instabilities or loss of performance on either task.
What would settle it
Training separate specialized models for generation and for prediction, then comparing their performance on generation quality, property prediction accuracy, conditional sampling success, and out-of-distribution prediction against the joint Hyformer.
Original abstract
Modeling the joint distribution of data samples and their properties allows to construct a single model for both data generation and property prediction, with synergistic benefits reaching beyond purely generative or predictive models. However, training joint models presents daunting architectural and optimization challenges. Here, we propose Hyformer, a transformer-based joint model that successfully blends the generative and predictive functionalities, using an alternating attention mechanism and a joint pre-training scheme. We show that Hyformer is simultaneously optimized for molecule generation and property prediction, while exhibiting synergistic benefits in conditional sampling, out-of-distribution property prediction and representation learning. Finally, we demonstrate the benefits of joint learning in a drug design use case of discovering novel antimicrobial peptides.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Hyformer, a transformer-based joint model for molecule generation and property prediction. It employs an alternating attention mechanism and joint pre-training to address architectural and optimization challenges, claiming that the model is simultaneously optimized for both tasks and exhibits synergistic benefits in conditional sampling, out-of-distribution property prediction, representation learning, and a drug-design use case involving discovery of novel antimicrobial peptides.
Significance. If the empirical results and ablations hold, the work could meaningfully advance joint generative-predictive modeling in molecular machine learning by demonstrating that a single model can match or exceed specialized baselines while unlocking synergies not available to separate models. The alternating-attention design and joint pre-training scheme constitute a concrete architectural contribution that may generalize beyond the reported tasks.
major comments (3)
- [Section 3] Section 3: The alternating attention mechanism is presented as resolving task interference, yet the manuscript provides neither per-task loss curves nor gradient-norm statistics across training epochs. Without these diagnostics it is impossible to verify that the joint optimization truly achieves simultaneous strong performance on generation and prediction rather than trading off one objective against the other.
- [Results (Tables 1–3)] Results (Tables 1–3 and associated figures): Single-task baselines for molecule generation (validity, uniqueness, novelty) and property prediction (MAE or R² on held-out sets) are not reported. Consequently the central claim of “synergistic benefits” and absence of negative transfer cannot be quantitatively evaluated; the reported joint-model numbers alone do not establish that Hyformer matches or exceeds specialized models on each task individually.
- [OOD experiments] OOD property-prediction experiments: The improvement attributed to joint training is shown only for the full Hyformer; an ablation that freezes the generative component or trains a prediction-only variant on the same data is missing. This ablation is load-bearing for the claim that joint pre-training confers out-of-distribution robustness.
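The quantities named in the second comment are standard and cheap to compute once samples and predictions are available. The sketch below shows uniqueness and novelty for generated strings, and MAE and R² for predictions, in plain Python. Validity is omitted because it needs a chemistry toolkit to parse molecules. The function names and toy inputs are illustrative, and the exact metric definitions used in the paper may differ (for example, uniqueness is often computed over valid molecules only).

```python
# Illustrative sketch of common evaluation metrics; definitions may differ from the paper's.

def uniqueness(generated: list[str]) -> float:
    """Fraction of generated strings that are distinct."""
    return len(set(generated)) / len(generated)


def novelty(generated: list[str], training: set[str]) -> float:
    """Fraction of distinct generated strings that do not appear in the training set."""
    distinct = set(generated)
    return len(distinct - training) / len(distinct)


def mae(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    samples = ["CCO", "CCO", "CCN", "c1ccccc1"]  # toy SMILES strings
    train = {"CCO"}
    print(uniqueness(samples), novelty(samples, train))
    print(mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]), r2([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
```

Running the same functions on a joint model and on single-task baselines trained on identical data is the comparison the referee is asking for.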
minor comments (2)
- [Abstract] Abstract: the phrase “simultaneously optimized for molecule generation and property prediction” is repeated without any numeric preview; adding one or two headline metrics would improve clarity.
- [Section 3] Notation: the alternating attention block is introduced with several new symbols; a consolidated table or diagram legend would reduce reader effort when cross-referencing equations.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our work. We address each of the major comments below and indicate the revisions we will make to the manuscript.
Point-by-point responses
- Referee: [Section 3] Section 3: The alternating attention mechanism is presented as resolving task interference, yet the manuscript provides neither per-task loss curves nor gradient-norm statistics across training epochs. Without these diagnostics it is impossible to verify that the joint optimization truly achieves simultaneous strong performance on generation and prediction rather than trading off one objective against the other.
  Authors: We agree that per-task loss curves and gradient-norm statistics would provide valuable evidence that the alternating attention mechanism enables simultaneous optimization without task interference. We will add these diagnostics to Section 3 in the revised manuscript.
  Revision: yes
- Referee: [Results (Tables 1–3)] Results (Tables 1–3 and associated figures): Single-task baselines for molecule generation (validity, uniqueness, novelty) and property prediction (MAE or R² on held-out sets) are not reported. Consequently the central claim of “synergistic benefits” and absence of negative transfer cannot be quantitatively evaluated; the reported joint-model numbers alone do not establish that Hyformer matches or exceeds specialized models on each task individually.
  Authors: We recognize that reporting single-task baselines is essential to quantitatively demonstrate synergistic benefits and the absence of negative transfer. In the revised manuscript, we will include single-task baseline results in Tables 1–3 for both generation and property prediction tasks, allowing direct comparison with the joint Hyformer model.
  Revision: yes
- Referee: [OOD experiments] OOD property-prediction experiments: The improvement attributed to joint training is shown only for the full Hyformer; an ablation that freezes the generative component or trains a prediction-only variant on the same data is missing. This ablation is load-bearing for the claim that joint pre-training confers out-of-distribution robustness.
  Authors: We agree that an ablation comparing the full model to a prediction-only variant is necessary to substantiate the OOD robustness benefits from joint pre-training. We will incorporate this ablation into the OOD experiments section of the revised manuscript.
  Revision: yes
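The diagnostics promised in the first response can be computed from per-task gradients of the shared parameters. A minimal sketch, assuming each task's gradient has already been flattened into a list of floats: the norm shows whether one objective dominates the update, and the cosine similarity between two task gradients shows whether they agree (near 1), are unrelated (near 0), or conflict (negative). The helper names are hypothetical and the numbers are toy values, not results from the paper.

```python
import math

# Illustrative sketch: gradient-norm and gradient-conflict diagnostics.

def grad_norm(grad: list[float]) -> float:
    """Euclidean (L2) norm of a flattened gradient vector."""
    return math.sqrt(sum(g * g for g in grad))


def grad_cosine(grad_a: list[float], grad_b: list[float]) -> float:
    """Cosine similarity between two flattened gradients of equal length."""
    dot = sum(a * b for a, b in zip(grad_a, grad_b))
    return dot / (grad_norm(grad_a) * grad_norm(grad_b))


if __name__ == "__main__":
    g_generation = [3.0, 4.0]   # toy gradient of the generation loss
    g_prediction = [4.0, -3.0]  # toy gradient of the prediction loss
    print(grad_norm(g_generation))                  # 5.0
    print(grad_cosine(g_generation, g_prediction))  # 0.0, the toy gradients are orthogonal
```

Logged once per epoch alongside the per-task losses, these two numbers would show whether training trades one objective against the other.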
Circularity Check
No circularity; empirical claims rest on experimental results rather than self-referential derivations
full rationale
The paper introduces Hyformer, a transformer architecture for joint molecule generation and property prediction via alternating attention and joint pre-training. All central claims—simultaneous optimization, synergies in conditional sampling, OOD prediction, and representation learning—are presented as outcomes of empirical evaluation on benchmarks and a drug-design case study. No equations, fitted parameters, or predictions are described that reduce by construction to the model's own inputs or to self-citations. The work contains no load-bearing uniqueness theorems, ansatzes smuggled via prior self-work, or renamings of known patterns; results are validated externally against specialized models and datasets.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Modeling the joint distribution of data samples and their properties allows construction of a single model with synergistic benefits beyond purely generative or predictive models.
invented entities (1)
- Hyformer: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  HYFORMER unifies a decoder with an encoder using a transformer backbone f_θ(x; [TASK]) conditioned on a task token [TASK] ∈ {[LM], [PRED], [MLM]}. ... ATT_Type = → if [TASK] = [LM], ↔ if [TASK] ∈ {[PRED], [MLM]}.
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  ℓ_HYFORMER = ℓ_LM + µ ℓ_MLM + η ℓ_PRED
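The second quoted passage is a weighted sum of three task losses. A minimal sketch of that combination, assuming the per-task losses are already computed as scalars. The function name and the default weights are placeholders chosen here, since this page does not report the values of µ and η.

```python
# Illustrative sketch: weighted sum of the three task losses in the quoted passage.
# Default weights are placeholders, not values from the paper.

def hyformer_loss(loss_lm: float, loss_mlm: float, loss_pred: float,
                  mu: float = 1.0, eta: float = 1.0) -> float:
    """l_HYFORMER = l_LM + mu * l_MLM + eta * l_PRED."""
    return loss_lm + mu * loss_mlm + eta * loss_pred


if __name__ == "__main__":
    print(hyformer_loss(2.0, 1.0, 0.5, mu=0.5, eta=2.0))  # 2.0 + 0.5 + 1.0 = 3.5
```

Setting µ or η to zero drops the masked-language-modeling or the prediction term respectively, which is a simple way to ablate either one.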
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.