SmileyLlama: Modifying Large Language Models for Directed Chemical Space Exploration

Andrew Gritsevskiy; Dorian Bagni; Joseph M. Cavanagh; Kunyang Sun; Teresa Head-Gordon; Thomas D. Bannister; Yingze Wang

arxiv: 2409.02231 · v5 · submitted 2024-09-03 · ⚛️ physics.chem-ph · cs.LG

Joseph M. Cavanagh, Kunyang Sun, Andrew Gritsevskiy, Dorian Bagni, Yingze Wang, Thomas D. Bannister, Teresa Head-Gordon

Pith reviewed 2026-05-23 21:14 UTC · model grok-4.3

keywords: large language models · supervised fine-tuning · chemical space exploration · drug discovery · molecular generation · direct preference optimization · SmileyLlama

The pith

Large language models can be fine-tuned with engineered prompts to generate drug-like molecules with user-specified properties.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that large language models can be transformed, via supervised fine-tuning on engineered prompts, into SmileyLlama, a model that explores chemical space to produce valid and novel drug-like molecules. This matters because users can direct molecule generation through a familiar language interface rather than relying on separate specialized models or on chatbots with only general chemistry knowledge. The work further applies direct preference optimization both to tighten prompt adherence and to drive an optimization loop for molecules with desired 3D shapes and strong binding to targets. The same base model also keeps most of its original language abilities while performing the chemical task.

Core claim

By training an LLM to speak directly as a chemical language model through supervised fine-tuning on engineered prompts, SmileyLlama reliably generates molecules that match user-specified properties. Direct preference optimization strengthens adherence to those prompts and integrates into a reinforcement learning setup that favors molecules with optimized conformations and high binding affinity. The resulting system is benchmarked against both general pre-trained language models and chemical language models trained from scratch, while the overall supervised fine-tuning plus preference optimization approach is presented as extensible beyond drug discovery.

What carries the argument

Supervised fine-tuning on engineered prompts, combined with direct preference optimization, converts the LLM into SmileyLlama for directed molecular generation.
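For readers unfamiliar with the preference step, the per-pair DPO loss from Rafailov et al. [39] can be sketched in a few lines. This is a minimal illustration and not the paper's training code: the function name, the argument layout, and the default `beta=0.1` are assumptions chosen for the example.

```python
import math

def dpo_pair_loss(policy_logp_chosen, policy_logp_rejected,
                  ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each argument is the summed log-probability of a full response under the
    trainable policy or the frozen reference model (hypothetical inputs).
    """
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and the reference agree, the loss is log 2; it falls as the policy raises the preferred response's likelihood relative to the rejected one.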

If this is right

SmileyLlama produces valid and novel drug-like molecules at rates comparable to or better than chemical language models trained from scratch.
Direct preference optimization inside the iMiner framework yields molecules with improved 3D conformations and higher binding affinity to chosen targets.
The model continues to handle ordinary natural language queries alongside its chemical generation task.
The supervised fine-tuning and direct preference optimization steps can be applied to other chemical, biological, or materials generation problems.
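
The validity, uniqueness, and novelty rates behind the first point are usually computed along the following lines. This is a sketch under assumptions: the definitions follow common practice in molecular-generation benchmarks and may differ from the paper's exact protocol, `is_valid` is a hypothetical stand-in for a real parser such as RDKit, and strings are compared literally rather than after canonicalization.

```python
def generation_metrics(generated, training_set, is_valid):
    """Return validity, uniqueness, and novelty rates for generated SMILES strings."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        # Fraction of all samples that pass the validity check.
        "validity": len(valid) / len(generated) if generated else 0.0,
        # Fraction of valid samples that are distinct.
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        # Fraction of distinct valid samples absent from the training set.
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }

def balanced(s):
    """Toy validity check (an assumption): parentheses must balance."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

metrics = generation_metrics(["CCO", "CCO", "CC(C", "c1ccccc1"], {"CCO"}, balanced)
```

In real use the toy `balanced` check would be replaced by a chemistry-aware parser, and SMILES strings would be canonicalized before the set operations.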

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Chemists could steer molecule design through ordinary conversational instructions instead of writing specialized code or SMILES strings.
The retained language abilities open the possibility of chaining the model with other language-based tools for multi-step design reasoning.
Similar prompt-based adaptation might shorten the data requirements when moving the same base model to new scientific domains.

Load-bearing premise

That supervised fine-tuning on engineered prompts plus direct preference optimization will produce reliable molecule generation matching user properties while keeping most natural language capabilities intact.

What would settle it

A controlled test set of prompts that specify concrete molecular properties such as molecular weight range or binding score threshold, followed by independent chemical validation showing that the generated structures fail to meet those properties at rates comparable to or worse than baseline models.
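
One way to score such a test is to express each prompt constraint as a bound and count how many generated molecules satisfy all of them. The sketch below is illustrative only: property values are assumed to come from an independent calculator, the dictionary keys are hypothetical, and the numbers are made up rather than taken from the paper. The bounds shown are Lipinski's Rule of 5.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}

def satisfies(props, constraints):
    """True if every (name, op, bound) constraint holds for one molecule."""
    return all(OPS[op](props[name], bound) for name, op, bound in constraints)

def adherence_rate(molecules, constraints):
    """Fraction of molecules whose computed properties meet all constraints."""
    if not molecules:
        return 0.0
    return sum(satisfies(m, constraints) for m in molecules) / len(molecules)

# Lipinski's Rule of 5 expressed as bounds.
ro5 = [("MW", "<=", 500), ("logP", "<=", 5), ("HBD", "<=", 5), ("HBA", "<=", 10)]

# Made-up property values for four hypothetical generated molecules.
batch = [
    {"MW": 320.4, "logP": 2.1, "HBD": 2, "HBA": 5},
    {"MW": 612.7, "logP": 4.0, "HBD": 3, "HBA": 9},  # violates the MW bound
    {"MW": 450.0, "logP": 5.6, "HBD": 1, "HBA": 7},  # violates the logP bound
    {"MW": 289.3, "logP": 1.2, "HBD": 0, "HBA": 4},
]
rate = adherence_rate(batch, ro5)
```

Running the same scoring on a baseline model's outputs for the same prompts gives the comparison the test calls for.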

Figures

Figures reproduced from arXiv: 2409.02231 by Andrew Gritsevskiy, Dorian Bagni, Joseph M. Cavanagh, Kunyang Sun, Teresa Head-Gordon, Thomas D. Bannister, Yingze Wang.

**Figure 1.** A visualization of the SFT workflow for SmileyLlama. Given the Llama-3.1-8B-Instruct model [26], we used prompt-response pairs consisting of calculated molecular properties and completed SMILES strings to fine-tune Llama on SMILES string completions, yielding SmileyLlama. Crucially, we construct the prompt for each example using properties calculated from the correct response (a SMILES string from ChEMBLv3…

**Figure 2.** Distribution comparisons for different properties of the generated molecules from SmileyLlama (blue) with molecules from the training dataset from ChEMBL (gold). (a) UMAP visualization of a random selection of 10,000 ChEMBL molecules and 10,000 SmileyLlama-generated molecules, using 15 neighbors and a minimum distance of 0.1. (b) The molecular properties considered are fraction of sp3 hybridized carbons an…

**Figure 3.** Conditional generation with SmileyLlama for fragment growth and before and after DPO compared to ChEMBL. (a) Example molecules generated by growing from one of the Enamine substructures and to satisfy Lipinski’s Rule of 5 using the prompt Output a SMILES string for a drug like molecule with the following properties: a substructure of O=C(O)c1ccc(C(F)(F)F)cc1, <= 500 MW, <=5 logP, <= 5 H-bond donors, <= 10 …

**Figure 4.** Comparison of the shift in docking score distributions for iMiner compared to SmileyLlama over optimization epochs as illustrated for SARS2-MPro. (a) For iMiner, in later epochs diversity crashes which explains the sharpening peaks in later iterations. SmileyLlama with DPO (SL+DPO) enforces diversity throughout the optimizations (Algorithm S3), which accounts for the broad peaks, and shows superior data ef…

**Figure 5.** SmileyLlama de novo generated molecules in the active site of SARS2 main protease. Surface rendering of the SmileyLlama generated molecules in the SARS2 Mpro canonical binding pocket. Generated by SmileyLlama after optimization with (a) the SARS2PRO prompt. (b) and (c) the SARS2Pro+Ro5 prompt. Supplementary Table S2 provides their SMILES string and docking scores, and Supplementary Figure S3 shows their do…

Original abstract

We show that large language models (LLMs) can be transformed via supervised fine-tuning (SFT) of engineered prompts into SmileyLlama for exploring the chemical space of drug molecules. We benchmark SmileyLlama against pre-trained LLMs and chemical language models (CLMs) trained from scratch for generating valid and novel drug-like molecules, and use direct preference optimization (DPO) to both improve SmileyLlama's adherence to a prompt and as part of the iMiner reinforcement learning framework to predict molecules with optimized 3D conformations and high binding affinity to drug targets. By training an LLM to speak directly as a CLM, while retaining most of its natural language capabilities, we show that we can reliably generate molecules with user-specified properties rather than acting only as a chatbot with knowledge of chemistry or as a virtual assistant. While SmileyLlama is geared toward drug discovery, the SFT/DPO/LLM framework can be extended to other chemical, biological, and materials applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SmileyLlama applies SFT and DPO to steer an LLM toward user-specified molecules, but the abstract supplies no numbers, error bars, or controls to verify retained natural-language performance.

The core move here is taking a general LLM, running supervised fine-tuning on engineered prompts so it behaves like a chemical language model, then layering DPO both for better prompt adherence and inside the iMiner loop for 3D affinity. That pipeline is the actual new piece; it is a direct application of existing alignment methods to the chemistry setting rather than a new algorithm. If the full paper shows concrete gains on validity, novelty, and property matching over both plain LLMs and scratch-trained CLMs, plus any evidence that general language tasks did not collapse, the approach could be a practical shortcut for groups that already have a strong base model and want to avoid training chemistry models from scratch. The suggestion that the same framework could transfer to materials or biology is reasonable on its face.

The main gap is the one the stress-test note flags. The abstract asserts that the model keeps most natural-language capabilities while acting as a directed generator, yet it gives no before-and-after metrics on anything outside the chemistry task, no ablation separating prompt engineering from capability loss, and no quantitative results at all on the claimed benchmarks. Without those controls it is impossible to tell whether the model has truly become a reliable dual-purpose system or has simply narrowed. The iMiner stage for conformations adds another untested assumption about the same weights handling both language and 3D optimization.

This work is aimed at people already doing LLM fine-tuning for drug discovery who need a concrete recipe rather than a from-scratch CLM. A reader in that niche could extract the training outline and try it, but the lack of reported evidence limits how far the claims can be trusted right now. The thinking is coherent and the citations follow standard practice for the area.

I would send it to peer review so referees can check whether the full manuscript actually contains the missing benchmarks and controls; the idea is straightforward enough that it is worth seeing the data.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that large language models can be transformed via supervised fine-tuning (SFT) of engineered prompts into SmileyLlama for exploring the chemical space of drug molecules. It benchmarks this model against pre-trained LLMs and chemical language models (CLMs) for generating valid and novel drug-like molecules, employs direct preference optimization (DPO) to improve prompt adherence, and integrates DPO with the iMiner reinforcement learning framework to predict molecules with optimized 3D conformations and high binding affinity to targets. The central assertion is that this produces reliable generation of molecules with user-specified properties while retaining most natural language capabilities, with potential extension to other chemical and materials applications.

Significance. If the quantitative benchmarks, error bars, and controls for natural-language retention are supplied and hold, the work could offer a meaningful bridge between flexible natural-language interfaces and directed chemical generation, potentially enabling more intuitive prompt-based exploration than standalone CLMs in drug discovery.

major comments (2)

[Abstract] The assertion of benchmarks against pre-trained LLMs and CLMs plus improvements via DPO supplies no numerical results, error bars, or method details, rendering the central performance claims unverifiable from the text.
[Abstract] The load-bearing claim that SFT/DPO produces a model that both generates molecules with user-specified properties and retains most natural-language capabilities lacks any before/after metrics on general NL tasks (e.g., MMLU, GSM8K) or chemistry QA, and no ablation controls isolating prompt engineering from capability loss are described.

minor comments (1)

The iMiner framework and the precise form of the engineered prompts would benefit from explicit pseudocode or algorithmic description to allow reproduction.
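
Pending pseudocode from the authors, the prompt quoted in the Figure 3 caption suggests a construction along these lines. This is a guess at the general shape and not the paper's implementation: the function names and dictionary layout are hypothetical, and the property clauses are written by hand here rather than calculated from the SMILES string as the Figure 1 caption describes.

```python
# Prefix copied from the prompt quoted in the Figure 3 caption.
PROMPT_PREFIX = ("Output a SMILES string for a drug like molecule "
                 "with the following properties: ")

def build_prompt(property_clauses):
    """Join property clauses into a single instruction string."""
    if not property_clauses:
        raise ValueError("at least one property clause is required")
    return PROMPT_PREFIX + ", ".join(property_clauses)

def make_sft_example(smiles, property_clauses):
    """Pair a prompt describing a molecule's properties with its SMILES string."""
    return {"prompt": build_prompt(property_clauses), "response": smiles}

# Hand-written clauses for illustration; a real pipeline would derive them
# from the response molecule with a cheminformatics toolkit.
example = make_sft_example(
    "O=C(O)c1ccc(C(F)(F)F)cc1",
    ["<= 500 MW", "<= 5 logP", "<= 5 H-bond donors"],
)
```

Deriving each prompt from the known response guarantees that every training pair is consistent by construction, which is the point the Figure 1 caption stresses.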

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed feedback on the abstract. We agree that the abstract requires strengthening with quantitative results and will revise it in the next version of the manuscript.

Referee: [Abstract] The assertion of benchmarks against pre-trained LLMs and CLMs plus improvements via DPO supplies no numerical results, error bars, or method details, rendering the central performance claims unverifiable from the text.

Authors: We agree the abstract is currently too high-level. In revision we will insert the key numerical benchmark outcomes (validity, novelty, and uniqueness rates versus baselines; DPO-induced gains in prompt adherence) together with standard deviations from repeated runs, plus a one-sentence pointer to the methods section for experimental details.

Revision: yes

Referee: [Abstract] The load-bearing claim that SFT/DPO produces a model that both generates molecules with user-specified properties and retains most natural-language capabilities lacks any before/after metrics on general NL tasks (e.g., MMLU, GSM8K) or chemistry QA, and no ablation controls isolating prompt engineering from capability loss are described.

Authors: The primary scope of the work is directed chemical generation rather than general LLM evaluation. We will revise the abstract to qualify the retention statement and, where internal chemistry-specific QA results exist, include a brief before/after comparison. Full ablation details on prompt engineering versus capability retention appear in the supplementary material of the current manuscript.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical SFT/DPO training

The paper describes transforming LLMs via supervised fine-tuning on engineered prompts into SmileyLlama, with subsequent DPO for prompt adherence and iMiner for 3D affinity optimization, then benchmarking generation of valid/novel molecules against pre-trained LLMs and CLMs. No mathematical derivations, equations, or fitted parameters are invoked that reduce claimed performance to self-definition or construction from inputs. The central premise of dual molecule generation plus retained natural-language capability is presented as an outcome of new training rather than a self-citation chain, uniqueness theorem, or renamed known result. This is a standard empirical training paper whose results are externally falsifiable via the described benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities; all ledger entries are therefore empty.

Reference graph

Works this paper leans on

85 extracted references · 85 canonical work pages · 11 internal anchors

[1] Grisoni, F. Chemical language models for de novo drug design: Challenges and opportunities. 79, 102527
[2] Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36
[3] Krenn, M.; Hase, F.; Nigam, A.; Friederich, P.; Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): A 100
[4] Cao, N. D.; Kipf, T. MolGAN: An implicit generative model for small molecular graphs. ArXiv 2018, abs/1805.11973
[5] Tong, X.; Liu, X.; Tan, X.; Li, X.; Jiang, J.; Xiong, Z.; Xu, T.; Jiang, H.; Qiao, N.; Zheng, M. Generative Models for De Novo Drug Design. Journal of medicinal chemistry 2021
[6] Flam-Shepherd, D.; Zhu, K.; Aspuru-Guzik, A. Language models can learn complex molecular distributions. Nature Communications 2022, 13, 3293
[7] Skinnider, M.; Stacey, R.; Wishart, D.; Foster, L. Chemical language models enable navigation in sparsely populated chemical space. Nature Machine Intelligence 2021, 3, 759–770
[8] Blaschke, T.; Arús-Pous, J.; Chen, H.; Margreitter, C.; Tyrchan, C.; Engkvist, O.; Papadopoulos, K.; Patronov, A. REINVENT 2.0: An AI Tool for De Novo Drug Design. Journal of chemical information and modeling 2020
[9] Grisoni, F. Chemical language models for de novo drug design: Challenges and opportunities. Current opinion in structural biology 2023, 79, 102527
[10] Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. 40, D1100–D1107
[11] Tingle, B. I.; Tang, K. G.; Castanon, M.; Gutierrez, J. J.; Khurelbaatar, M.; Dandarchuluun, C.; Moroz, Y. S.; Irwin, J. J. ZINC-22–A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery. J. Chem. Info. Model. 2023, 63, 1166–1176
[12] Gómez-Bombarelli, R.; Wei, J. N.; Duvenaud, D.; Hernández-Lobato, J. M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T. D.; Adams, R. P.; Aspuru-Guzik, A. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Central Science 2018, 4, 268–276
[13] Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Computation 1997, 9, 1735–1780
[14] Gupta, A.; Müller, A. T.; Huisman, B. J. H.; Fuchs, J. A.; Schneider, P.; Schneider, G. Generative Recurrent Networks for De Novo Drug Design. Molecular Informatics 2017, 37
[15] Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018; https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
[16] Bagal, V.; Aggarwal, R.; Vinod, P. K.; Priyakumar, U. MolGPT: Molecular Generation Using a Transformer-Decoder Model. Journal of chemical information and modeling 2021, 62, 2064–2076
[17] Gu, A.; Goel, K.; Ré, C. Efficiently Modeling Long Sequences with Structured State Spaces. 2022; https://arxiv.org/abs/2111.00396
[18] Özçelik, R.; de Ruiter, S.; Criscuolo, E.; Grisoni, F. Chemical language modeling with structured state space sequence models. Nature Communications 2024, 15, 6176
[19] Wang, Y.; Zhao, H.; Sciabola, S.; Wang, W. cMolGPT: A Conditional Generative Pre-Trained Transformer for Target-Specific De Novo Molecular Generation. Molecules 2023, 28
[20] Zhou, Z.; Kearnes, S.; Li, L.; Zare, R. N.; Riley, P. Optimization of Molecules via Deep Reinforcement Learning. Scientific Reports 2019, 9, 10752, Published: 24 July 2019
[21] Li, J.; Zhang, O.; Sun, K.; Wang, Y.; Guan, X.; Bagni, D.; Haghighatlari, M.; Kearns, F. L.; Parks, C.; Amaro, R. E.; Head-Gordon, T. Mining for Potent Inhibitors through Artificial Intelligence and Physics: A Unified Methodology for Ligand Based and Structure Based Drug Design. Journal of Chemical Information and Modeling 2024
[22] Rosenfeld, R. Two decades of statistical language modeling: where do we go from here? Proceedings of the IEEE 2000, 88, 1270–1278
[23] Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. 2023; https://arxiv.org/abs/1706.03762
[24] Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. 2018
[25] OpenAI et al. GPT-4 Technical Report. http://arxiv.org/abs/2303.08774
[26] Dubey, A. et al. The Llama 3 Herd of Models. http://arxiv.org/abs/2407.21783
[27] Zheng, Z.; Zhang, O.; Nguyen, H. L.; Rampal, N.; Alawadhi, A. H.; Rong, Z.; Head-Gordon, T.; Borgs, C.; Chayes, J. T.; Yaghi, O. M. ChatGPT Research Group for Optimizing the Crystallinity of MOFs and COFs. ACS Cent. Sci. 2023, 9, 2161–2170
[28] Boiko, D. A.; MacKnight, R.; Kline, B.; Gomes, G. Autonomous chemical research with large language models. 624, 570–578, Publisher: Nature Publishing Group
[29] M. Bran, A.; Cox, S.; Schilter, O.; Baldassari, C.; White, A. D.; Schwaller, P. Augmenting large language models with chemistry tools. 6, 525–535, Publisher: Nature Publishing Group
[30] Edwards, C.; Lai, T.; Ros, K.; Honke, G.; Cho, K.; Ji, H. Translation between Molecules and Natural Language. 2022; http://arxiv.org/abs/2204.11817, arXiv:2204.11817 [cs]
[31] Pei, Q.; Zhang, W.; Zhu, J.; Wu, K.; Gao, K.; Wu, L.; Xia, Y.; Yan, R. BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations. ArXiv 2023, abs/2310.07276
[32] Yu, B.; Baker, F. N.; Chen, Z.; Ning, X.; Sun, H. LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset. 2024; https://arxiv.org/abs/2402.09391
[33] Wang, H.; Skreta, M.; Ser, C.-T.; Gao, W.; Kong, L.; Strieth-Kalthoff, F.; Duan, C.; Zhuang, Y.; Yu, Y.; Zhu, Y.; Du, Y.; Aspuru-Guzik, A.; Neklyudov, K.; Zhang, C. Efficient Evolutionary Search Over Chemical Space with Large Language Models. 2024; http://arxiv.org/abs/2406.16976, arXiv:2406.16976 [physics]
[34] Guevorguian, P.; Bedrosian, M.; Fahradyan, T.; Chilingaryan, G.; Khachatrian, H.; Aghajanyan, A. Small Molecule Optimization with Large Language Models. http://arxiv.org/abs/2407.18897, version: 1
[35] Bhattacharya, D.; Cassady, H.; Hickner, M.; Reinhart, W. Large Language Models as Molecular Design Engines. 2024; https://chemrxiv.org/engage/chemrxiv/article-details/664c98ea418a5379b0e07d31
[36] Liu, X.; Guo, Y.; Li, H.; Liu, J.; Huang, S.; Ke, B.; Lv, J. DrugLLM: Open Large Language Model for Few-shot Molecule Generation. ArXiv 2024
[37] Wu, Z.; Zhang, O.; Wang, X.; Fu, L.; Zhao, H.; Wang, J.; Du, H.; Jiang, D.; Deng, Y.; Cao, D.; Hsieh, C.-Y.; Hou, T. Leveraging language model for advanced multiproperty molecular optimization via prompt engineering. 1–11, Publisher: Nature Publishing Group
[38] Ahmed, S. J.; Elattar, M. A. Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning. 2024; http://arxiv.org/abs/2405.06836, arXiv:2405.06836 [cs, q-bio]
[39] Rafailov, R.; Sharma, A.; Mitchell, E.; Ermon, S.; Manning, C. D.; Finn, C. Direct Preference Optimization: Your Language Model is Secretly a Reward Model. http://arxiv.org/abs/2305.18290
[40] Cui, W.; Yang, K.; Yang, H. Recent Progress in the Drug Development Targeting SARS-CoV-2 Main Protease as Treatment for COVID-19. Front. Mol. Biosci. 2020, 7, 398
[41] Sun, K.; Bagni, D.; Cavanagh, J. M.; Wang, Y.; Sawyer, J. M.; Gritsevskiy, A.; Zhang, O.; Head-Gordon, T. SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models. 2025; https://arxiv.org/abs/2503.12602
[42] Landrum, G. RDKit: Open-Source Cheminformatics Software. 2016
[43] Jhoti, H.; Williams, G.; Rees, D. C.; Murray, C. W. The 'rule of three' for fragment-based drug discovery: where are we now? 12, 644–644, Publisher: Nature Publishing Group
[44] Veber, D. F.; Johnson, S. R.; Cheng, H.-Y.; Smith, B. R.; Ward, K. W.; Kopple, K. D. Molecular Properties That Influence the Oral Bioavailability of Drug Candidates. 45, 2615–2623, Publisher: American Chemical Society
[45] Lian, W. axolotl. https://github.com/axolotl-ai-cloud/axolotl/tree/main
[46] Hu, E. J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. http://arxiv.org/abs/2106.09685
[47] Dao, T.; Fu, D. Y.; Ermon, S.; Rudra, A.; Ré, C. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Advances in Neural Information Processing Systems (NeurIPS). 2022
[48] Dao, T. FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. International Conference on Learning Representations (ICLR). 2024
[49] Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization. http://arxiv.org/abs/1412.6980
[50] Taori, R.; Gulrajani, I.; Zhang, T.; Dubois, Y.; Li, X.; Guestrin, C.; Liang, P.; Hashimoto, T. B. Stanford Alpaca: An Instruction-following LLaMA model. https://github.com/tatsu-lab/stanford_alpaca, 2023
[51] Wang, Y.; Kordi, Y.; Mishra, S.; Liu, A.; Smith, N. A.; Khashabi, D.; Hajishirzi, H. Self-Instruct: Aligning Language Models with Self-Generated Instructions. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Toronto, Canada, 2023; pp 13484–13508
[52] Brown, N.; Fiscato, M.; Segler, M. H.; Vaucher, A. C. GuacaMol: Benchmarking Models for de Novo Molecular Design. Journal of Chemical Information and Modeling 2019, 59, 1096–1108
[53] Preuer, K.; Renz, P.; Unterthiner, T.; Hochreiter, S.; Klambauer, G. Fréchet ChemNet distance: a metric for generative models for molecules in drug discovery. J. Chem. Inform. Model. 2018, 58, 1736–1741
[54] Iwata, H.; Nakai, T.; Koyama, T.; Matsumoto, S.; Kojima, R.; Okuno, Y. VGAE-MCTS: A New Molecular Generative Model Combining the Variational Graph Auto-Encoder and Monte Carlo Tree Search. Journal of Chemical Information and Modeling 2023, 63, 7392–7400
[55] Bickerton, G. R.; Paolini, G. V.; Besnard, J.; Muresan, S.; Hopkins, A. L. Quantifying the chemical beauty of drugs. Nature Chem. 2012, 4, 90–98
[56] Wildman, S. A.; Crippen, G. M. Prediction of physicochemical parameters by atomic contributions. J. Chem. Inform. Comp. Sci. 1999, 39, 868–873
[57] S, P.; RJ, D. Topological polar surface area: a useful descriptor in 2D-QSAR. Curr Med Chem 2009, 16, 21–41
[58] Brenk, R.; Schipani, A.; James, D.; Krasowski, A.; Gilbert, I. H.; Frearson, J.; Wyatt, P. G. Lessons learnt from assembling screening libraries for drug discovery for neglected diseases. ChemMedChem 2008, 3, 435
[59] Schwartz, E.; Choshen, L.; Shtok, J.; Doveh, S.; Karlinsky, L.; Arbelle, A. NumeroLogic: Number Encoding for Enhanced LLMs’ Numerical Reasoning. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Miami, Florida, USA, 2024; pp 206–212
[60] Multivariate Density Estimation; John Wiley & Sons, Ltd, 1992; Chapter 6, pp 125–193
[61] Enamine Essential Fragment Library. https://enamine.net/compound-libraries/fragment-libraries/essential-library, Accessed: 2024-08-23
[62]

A.; Lombardo, F.; Dominy, B

Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings1PII of original article: S0169-409X(96)00423-

work page
[63]

Advanced Drug Delivery Reviews 2001, 46, 3–26, Special issue dedicated to Dr

The article was originally published in Advanced Drug Delivery Reviews 23 (1997) 3–25.1. Advanced Drug Delivery Reviews 2001, 46, 3–26, Special issue dedicated to Dr. Eric Tomlinson, Advanced Drug Delivery Reviews, A Selection of the Most Highly Cited Articles, 1991-1998

work page 1997
[64]

Preference Optimization for Molecular Language Models

Park, R.; Theisen, R.; Sahni, N.; Patek, M.; Cicho ´nska, A.; Rahman, R. Preference Optimization for Molecular Language Models. http://arxiv.org/abs/2310.12304

work page arXiv
[65]

Molecular de-novo design through deep reinforcement learning

Olivecrona, M.; Blaschke, T.; Engkvist, O.; Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 2017, 9, 1–14

[66]

Popova, M.; Isayev, O.; Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 2018, 4, eaap7885

[67]

Trott, O.; Olson, A. J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comp. Chem. 2010, 31, 455–461

[68]

Merity, S.; Keskar, N. S.; Socher, R. Regularizing and Optimizing LSTM Language Models. International Conference on Learning Representations. 2018

[69]

Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. CoRR 2017, abs/1707.06347

[70]

Jin, Z.; Du, X.; Xu, Y.; Deng, Y.; Liu, M.; Zhao, Y.; Zhang, B.; Li, X.; Zhang, L.; Peng, C.; et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 2020, 582, 289–293

[71]

Zhang, L.; Lin, D.; Sun, X.; Curth, U.; Drosten, C.; Sauerhering, L.; Becker, S.; Rox, K.; Hilgenfeld, R. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science 2020, 368, 409–412

[72]

Zhang, C.-H. et al. Potent Noncovalent Inhibitors of the Main Protease of SARS-CoV-2 from Molecular Sculpting of the Drug Perampanel Guided by Free Energy Perturbation Calculations. ACS Central Science 2021, 7, 467–475

[73]

Zhou, Y.; Zhang, Y.; Zhao, D.; Yu, X.; Shen, X.; Zhou, Y.; Wang, S.; Qiu, Y.; Chen, Y.; Zhu, F. TTD: Therapeutic Target Database describing target druggability information. Nucleic Acids Res 2024, 52, D1465–D1477

[74]

Ertl, P.; Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of Cheminformatics 2009, 1, 8

[75]

Gao, L. et al. A framework for few-shot language model evaluation. 2024; https://zenodo.org/records/12608602

[76]

Hendrycks, D.; Burns, C.; Basart, S.; Zou, A.; Mazeika, M.; Song, D.; Steinhardt, J. Measuring Massive Multitask Language Understanding

[77]

Wang, Y. et al. MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark. 2024; https://arxiv.org/abs/2406.01574

[78]

Rein, D.; Hou, B. L.; Stickland, A. C.; Petty, J.; Pang, R. Y.; Dirani, J.; Michael, J.; Bowman, S. R. GPQA: A Graduate-Level Google-Proof Q&A Benchmark. 2023; https://arxiv.org/abs/2311.12022

[79]

Hendrycks, D.; Burns, C.; Kadavath, S.; Arora, A.; Basart, S.; Tang, E.; Song, D.; Steinhardt, J. Measuring Mathematical Problem Solving With the MATH Dataset. 2021; https://arxiv.org/abs/2103.03874

[80]

Gema, A. P. et al. Are We Done with MMLU? Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). Albuquerque, New Mexico, 2025; pp 5069–5096

