Towards Discovery of Polymers for Insulin Delivery via Physics-Grounded Agentic Workflows

Martins Otun

arxiv: 2605.18831 · v1 · pith:MM2NUDEUnew · submitted 2026-05-12 · 🧬 q-bio.QM · cs.LG

Pith reviewed 2026-05-20 20:53 UTC · model grok-4.3

keywords: insulin delivery, polymer design, agentic workflows, physics simulations, large language models, molecular interactions, hydrogen bonding, drug stabilization

The pith

An LLM-directed workflow with physics simulations finds polymers whose simulated interaction energy with insulin reaches -2263 kJ/mol.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a large language model can autonomously guide physics-based simulations to identify polymer structures that interact strongly with insulin. This addresses the problem of cold-chain storage limiting insulin access for hundreds of millions of people by seeking materials that could enable thermally protective delivery methods. The system maintains a persistent record of hypotheses and results to shape each new proposal while automatically discarding structures that fail packing or naming checks. Under identical numbers of simulation evaluations, the best run reaches a simulated interaction energy of -2263 kJ/mol and outperforms both reinforcement learning and Bayesian optimization. Multiple independent runs converge on polymers whose repeat units contain dense hydrogen-bond donor and acceptor sites.

Core claim

Starting from the need for thermally protective insulin polymers, the work deploys an agentic workflow in which a large language model calls physics tools through the Model Context Protocol to explore the discrete PSMILES space. Under matched oracle budgets the best autonomous campaign reaches an insulin-polymer interaction energy of -2263 kJ/mol, outperforming reinforcement-learning baselines by 68% and Bayesian optimization by 19%. Three independent campaigns converge on one structural motif of dense hydrogen-bond donors and acceptors per repeat unit, while physics checks reject infeasible packings and name-structure mismatches before they influence the next step.
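
The headline percentages can be turned into rough baseline energies with a little arithmetic. The sketch below assumes, without confirmation from the paper, that "outperforming by p%" means the magnitude of the best energy is p% larger than the baseline's magnitude; the function name `implied_baseline` is hypothetical and the resulting numbers are back-of-envelope estimates, not values reported by the author.

```python
def implied_baseline(best_energy: float, improvement: float) -> float:
    """Baseline energy implied by a relative improvement in |E|.

    Assumption (not stated in the paper): improvement is defined as
    (|best| - |baseline|) / |baseline|.
    """
    return best_energy / (1.0 + improvement)


best = -2263.0  # kJ/mol, best autonomous campaign as quoted in the core claim

rl_baseline = implied_baseline(best, 0.68)  # reinforcement-learning baseline
bo_baseline = implied_baseline(best, 0.19)  # Bayesian-optimization baseline

print(f"implied RL baseline: {rl_baseline:.1f} kJ/mol")
print(f"implied BO baseline: {bo_baseline:.1f} kJ/mol")
```

Under that reading the baselines would sit near -1347 kJ/mol (RL) and -1902 kJ/mol (BO); a different definition of the percentage would shift both figures.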

What carries the argument

A persistent discovery world accumulates hypotheses, literature claims, and simulation outcomes, so that the large language model can act as an implicit acquisition function that proposes new polymer candidates for OpenMM and Packmol evaluation.
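
A minimal sketch of how such a loop might be organized is given here. It is not the author's implementation: the `DiscoveryWorld` fields mirror the three categories named above, while `run_campaign`, the toy proposer, the toy oracle, and the feasibility rule are all hypothetical stand-ins for the LLM, the OpenMM/Packmol evaluation, and the physics checks.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class DiscoveryWorld:
    """Persistent state that conditions each new proposal (illustrative schema)."""
    hypotheses: List[str] = field(default_factory=list)
    literature_claims: List[str] = field(default_factory=list)
    outcomes: List[Tuple[str, float]] = field(default_factory=list)  # (PSMILES, kJ/mol)

    def best(self) -> Optional[Tuple[str, float]]:
        # More negative interaction energy is treated as better.
        return min(self.outcomes, key=lambda o: o[1]) if self.outcomes else None


def run_campaign(
    propose: Callable[[DiscoveryWorld], str],
    oracle: Callable[[str], float],
    is_feasible: Callable[[str], bool],
    budget: int,
) -> DiscoveryWorld:
    """Spend at most `budget` oracle calls; rejected candidates cost none."""
    world = DiscoveryWorld()
    calls, attempts = 0, 0
    while calls < budget and attempts < 10 * budget:  # guard against endless rejection
        attempts += 1
        candidate = propose(world)
        if not is_feasible(candidate):
            world.hypotheses.append(f"rejected: {candidate}")
            continue
        world.outcomes.append((candidate, oracle(candidate)))
        calls += 1
    return world


# Toy stand-ins (hypothetical): a fixed candidate list instead of an LLM,
# and a heteroatom count instead of a physics simulation.
CANDIDATES = ["[*]CC[*]", "bad", "[*]CC(O)[*]", "[*]CC(N)C(=O)O[*]"]


def toy_propose(world: DiscoveryWorld) -> str:
    step = len(world.outcomes) + len(world.hypotheses)
    return CANDIDATES[step % len(CANDIDATES)]


def toy_oracle(psmiles: str) -> float:
    return -100.0 * (psmiles.count("O") + psmiles.count("N"))


def toy_feasible(psmiles: str) -> bool:
    return psmiles.count("[*]") == 2  # a repeat unit needs two attachment points


world = run_campaign(toy_propose, toy_oracle, toy_feasible, budget=3)
print(world.best())
```

The point of the structure is that the proposer sees the whole world, including rejections, on every call, which is what lets it behave like an acquisition function rather than a memoryless sampler.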

If this is right

Polymers with dense hydrogen-bond donors and acceptors per repeat unit produce the strongest simulated interactions with insulin.
The same workflow applies to other protein-stabilization tasks whenever a tractable simulation oracle exists.
Automatic rejection of infeasible packings and name mismatches improves search efficiency by avoiding wasted evaluations.
CPU-bound execution on commodity hardware makes the approach accessible for a wide range of material screening problems.
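
The first consequence above refers to the density of hydrogen-bond donors and acceptors per repeat unit. As a rough illustration, the sketch below counts nitrogen and oxygen atoms as a fraction of heavy atoms in a PSMILES string. This is a crude proxy chosen for this page, not the paper's descriptor: it ignores hydrogen counts, charges, and multi-letter bracket atoms, and the function name `hbond_site_density` is hypothetical.

```python
import re

# Organic-subset atom symbols; two-letter halogens are tried first so that
# "Cl" is not read as carbon followed by a stray character.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def hbond_site_density(psmiles: str) -> float:
    """Fraction of heavy atoms in the repeat unit that are N or O (crude proxy)."""
    body = psmiles.replace("[*]", "")  # drop the two attachment-point markers
    atoms = ATOM.findall(body)
    if not atoms:
        return 0.0
    hetero = sum(1 for a in atoms if a in ("N", "O", "n", "o"))
    return hetero / len(atoms)


for unit in ["[*]CC[*]", "[*]CC(O)[*]", "[*]CC(C(=O)N)[*]"]:
    print(unit, round(hbond_site_density(unit), 3))
```

A proper count would use a cheminformatics toolkit's donor and acceptor definitions; the string heuristic only shows the kind of per-repeat-unit quantity the motif describes.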

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the simulated binding energies predict experimental thermal protection, the discovered polymers could support insulin patches usable without refrigeration.
Adding a loop that feeds real experimental data back into the discovery world could reduce the simulation-to-reality gap.
The repeated convergence on hydrogen-bond-rich motifs points to a possible general design principle for polymer-protein stabilization that could be tested in other biologics.

Load-bearing premise

The interaction energies and packing results from OpenMM and Packmol simulations match the real behavior of synthesized polymers with insulin closely enough for the discovered candidates to perform as predicted in experiments.

What would settle it

Laboratory synthesis of the top polymer candidates followed by direct measurement of their insulin binding energy or thermal stability, compared against the simulated value of -2263 kJ/mol.

Original abstract

Cold-chain storage limits access to insulin for hundreds of millions of people; a thermally protective patch polymer could help, but the design space is too large for exhaustive experiment. Starting from that problem, we narrow to an agentic workflow: a large language model (LLM) calls physics-based tools through the Model Context Protocol (MCP), searching the discrete PSMILES space under a budget of OpenMM Packmol-matrix evaluations. The LLM acts as an implicit acquisition function conditioned on a persistent "discovery world": hypotheses, literature claims, and simulation outcomes updated each iteration. Under matched oracle budgets, the best autonomous campaign reaches an insulin-polymer interaction energy of -2263 kJ/mol, outperforming reinforcement-learning baselines by 68% and Bayesian optimization by 19%. Three independent campaigns converge on one structural motif (dense hydrogen-bond donors and acceptors per repeat unit) while physics checks reject infeasible packings and name-structure mismatches before they steer the next step. The science stage is CPU-bound and runs on commodity hardware. More broadly, the same architecture and workflow designed here applies to other protein-stabilization tasks whenever a tractable screening oracle is available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The LLM agent beats RL and BO on simulated interaction energies, but the results rest on an unvalidated proxy for real insulin performance.

The paper's core result is that an LLM agent calling OpenMM and Packmol via MCP finds a polymer-insulin interaction energy of -2263 kJ/mol, beating reinforcement learning by 68% and Bayesian optimization by 19% under matched evaluation budgets. It also reports that three runs converge on the same motif of dense hydrogen-bond donors and acceptors, with built-in checks that drop infeasible packings and name mismatches before they affect the next step. The setup keeps a running discovery world of hypotheses and prior outcomes, which is a practical way to make the agent less myopic than pure black-box optimizers. That combination of agentic control plus physics oracles on commodity hardware is the actual new piece here, and the quantitative comparisons give it some teeth. The workflow is described clearly enough that someone could try to reproduce the agent loop.

The soft spot is exactly where the stress-test note flags it: the headline numbers treat the computed energy as a reliable stand-in for thermal protection or delivery in a real patch, yet the work shows no correlation of that metric against known stabilizing polymers from the literature, no ablation linking lower energies to measurable stability gains, and no wet-lab data at all. Without that link, the improvements stay inside the simulation and do not yet support the claim of addressing cold-chain access.

The paper is aimed at groups already running agentic or oracle-based materials searches who want to see the MCP pattern applied to a protein problem. A reader focused on actual formulation work will find the idea but little they can use directly. It is coherent on its own terms and shows honest engagement with the optimization baselines, so it deserves a serious referee who can ask for the missing validation steps or literature checks.

Referee Report

3 major / 2 minor

Summary. The manuscript presents an agentic workflow in which an LLM orchestrates calls to OpenMM and Packmol physics simulations to search the discrete PSMILES space for polymers that maximize interaction energy with insulin. Under matched oracle budgets the best autonomous campaign reports an interaction energy of -2263 kJ/mol, outperforming reinforcement-learning baselines by 68 % and Bayesian optimization by 19 %, with three independent runs converging on a structural motif of dense hydrogen-bond donors and acceptors per repeat unit; physics checks for packing feasibility and name-structure consistency are applied before each iteration.

Significance. If the computed interaction energies can be shown to rank-order polymers in a manner that predicts experimental thermal stability or release kinetics, the work would demonstrate a practical route for physics-grounded autonomous discovery in protein-stabilization tasks. The persistent discovery world, use of external simulation oracles rather than learned surrogates, and explicit feasibility filters are genuine strengths that distinguish the approach from purely data-driven methods. The CPU-bound execution on commodity hardware further supports reproducibility and accessibility.

major comments (3)

[Abstract / Results] Abstract and Results: the headline claim of -2263 kJ/mol together with the 68 % and 19 % improvements is presented without any description of how the insulin-polymer interaction energy is extracted from the OpenMM/Packmol output (force field, simulation length, ensemble averaging, or error estimation), which is load-bearing for interpreting the numerical superiority.
[Methods] Methods: no protocol is given for the Packmol-matrix construction or the subsequent OpenMM energy evaluation, nor is there an ablation showing that the reported gains arise from the agentic workflow rather than from the oracle itself; this omission prevents assessment of whether the central performance advantage is robust.
[Results / Discussion] Results / Discussion: the manuscript contains no correlation of the computed energies against literature polymers with known stabilizing or destabilizing effects on insulin, nor any wet-lab measurements of thermal stability or release kinetics; without such grounding the proxy metric cannot yet support the claim of utility for thermally protective insulin delivery.

minor comments (2)

[Introduction] The term 'discovery world' is used repeatedly but never given an explicit schema or diagram showing its contents and update rules.
[Figures] Figure captions should explicitly state the number of independent campaigns and the exact oracle budget used for each baseline comparison.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their constructive and detailed review. The comments highlight important areas for improving clarity, reproducibility, and context. We address each major comment point-by-point below, making revisions where they strengthen the manuscript without altering its core claims or scope. The work remains a computational demonstration of an agentic physics-grounded workflow.

Referee: [Abstract / Results] Abstract and Results: the headline claim of -2263 kJ/mol together with the 68 % and 19 % improvements is presented without any description of how the insulin-polymer interaction energy is extracted from the OpenMM/Packmol output (force field, simulation length, ensemble averaging, or error estimation), which is load-bearing for interpreting the numerical superiority.

Authors: We agree that explicit details on energy extraction are necessary for proper interpretation and reproducibility. In the revised manuscript we have added a dedicated subsection in Methods describing: (i) the force field (CHARMM36 for the protein and compatible parameters for the polymer), (ii) the OpenMM protocol consisting of 5000-step minimization followed by 10 ns NPT equilibration and 5 ns production sampling, (iii) interaction energy computed as the difference in total potential energy between the solvated complex and the separately minimized components, and (iv) ensemble averaging over the final 2 ns with standard-error estimation. These additions directly support the reported numerical values and the performance comparisons. revision: yes
Referee: [Methods] Methods: no protocol is given for the Packmol-matrix construction or the subsequent OpenMM energy evaluation, nor is there an ablation showing that the reported gains arise from the agentic workflow rather than from the oracle itself; this omission prevents assessment of whether the central performance advantage is robust.

Authors: We have expanded the Methods section with a complete protocol: Packmol is used to generate an initial 5 nm cubic box containing one insulin molecule and 20 polymer chains at a target density of 0.8 g/cm³, followed by OpenMM energy minimization and short equilibration before the interaction-energy oracle call. To address the ablation concern, we added a new supplementary figure comparing the agentic workflow against random sampling and a non-agentic greedy baseline that uses the identical oracle under the same budget; the agentic approach still outperforms by 42 % and 27 %, respectively. While an exhaustive component-wise ablation would require additional runs, the current controls demonstrate that the workflow itself contributes to the observed gains beyond the oracle alone. revision: partial
Referee: [Results / Discussion] Results / Discussion: the manuscript contains no correlation of the computed energies against literature polymers with known stabilizing or destabilizing effects on insulin, nor any wet-lab measurements of thermal stability or release kinetics; without such grounding the proxy metric cannot yet support the claim of utility for thermally protective insulin delivery.

Authors: We acknowledge the value of external grounding. In the revised Discussion we now include a paragraph correlating the discovered motif (high density of H-bond donors/acceptors) with known insulin-stabilizing excipients from the literature (e.g., trehalose and certain PEG derivatives), noting qualitative consistency with experimental stabilization mechanisms. However, performing new wet-lab thermal-stability or release-kinetics measurements lies outside the scope of this computational study, which focuses on demonstrating a reproducible physics-oracle workflow. We have clarified that the interaction energy is presented as a physics-based proxy rather than a direct predictor of formulation performance, and we explicitly flag experimental validation as future work. revision: partial
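
The box parameters quoted in the second simulated response (a 5 nm cubic box, one insulin molecule, 20 polymer chains, target density 0.8 g/cm³) can be sanity-checked with a short calculation. The sketch below assumes the target density applies to everything in the box and that no solvent is present, which the response does not state; it also takes the insulin monomer mass as roughly 5808 Da. Under those assumptions it gives an upper bound on the mass each chain could have.

```python
AVOGADRO = 6.02214076e23          # 1/mol
NM3_TO_CM3 = 1e-21                # 1 nm^3 expressed in cm^3

box_edge_nm = 5.0                 # from the simulated rebuttal
target_density = 0.8              # g/cm^3, from the simulated rebuttal
n_chains = 20                     # from the simulated rebuttal
insulin_mass_da = 5808.0          # approximate mass of one insulin monomer

# Total molar mass the box can hold at the target density.
volume_cm3 = box_edge_nm ** 3 * NM3_TO_CM3
total_mass_da = target_density * volume_cm3 * AVOGADRO

# Whatever insulin does not use is shared by the polymer chains
# (assumption: no water or ions in the packed box).
per_chain_da = (total_mass_da - insulin_mass_da) / n_chains

print(f"box capacity: {total_mass_da:.0f} Da")
print(f"max mass per chain: {per_chain_da:.0f} Da")
```

The result is a capacity of about 60,000 Da and roughly 2,700 Da per chain, so the stated box would hold short oligomers rather than long polymers; any solvent in the box would lower that ceiling further.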

standing simulated objections not resolved

Direct experimental validation (wet-lab thermal stability or release kinetics) cannot be provided within the current computational manuscript; such measurements require physical polymer synthesis and formulation testing that are beyond the paper's scope.

Circularity Check

0 steps flagged

No significant circularity; results driven by external physics oracles

The paper's derivation chain consists of an LLM-orchestrated search over PSMILES space that repeatedly invokes external OpenMM/Packmol simulations as oracles to compute interaction energies. Performance is reported by direct comparison to RL and BO baselines under identical oracle budgets, with convergence on a hydrogen-bond motif and rejection of infeasible packings performed by the same external physics checks. No equations reduce a claimed prediction to a fitted parameter by construction, no load-bearing premise rests on self-citation chains, and no uniqueness theorem or ansatz is imported from prior author work. The workflow is therefore self-contained against the simulation benchmarks without internal redefinition or statistical forcing of the headline metric.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central performance claims rest on the assumption that the simulation oracle is accurate and that the LLM can effectively leverage the maintained state without introducing systematic errors.

axioms (1)

domain assumption: Physics simulations via OpenMM and Packmol provide a reliable oracle for evaluating polymer-insulin interactions
Central to the evaluation budget and results reported.

invented entities (1)

discovery world (no independent evidence)
purpose: Persistent storage of hypotheses, literature claims, and simulation outcomes to condition the LLM's decisions
Introduced as a key component of the agentic workflow to enable iterative improvement.

pith-pipeline@v0.9.0 · 5730 in / 1366 out tokens · 82481 ms · 2026-05-20T20:53:23.407255+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear

The screening objective is the non-bonded interaction energy between insulin and a polymer shell... E_int = E_complex - E_insulin - E_polymer
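
The quoted objective can be evaluated per frame and then averaged. The sketch below applies E_int = E_complex - E_insulin - E_polymer to a few made-up energy triples and reports the mean with a naive standard error. The numbers are illustrative only, the helper names are hypothetical, and the naive standard error treats frames as independent, which consecutive molecular-dynamics frames generally are not.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Iterable, Tuple

Frame = Tuple[float, float, float]  # (E_complex, E_insulin, E_polymer) in kJ/mol


def interaction_energy(e_complex: float, e_insulin: float, e_polymer: float) -> float:
    """E_int = E_complex - E_insulin - E_polymer, as in the quoted passage."""
    return e_complex - e_insulin - e_polymer


def summarize(frames: Iterable[Frame]) -> Tuple[float, float]:
    """Mean interaction energy and naive standard error over the frames."""
    values = [interaction_energy(*f) for f in frames]
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else float("nan")
    return mean(values), sem


# Made-up energies for three frames (not data from the paper).
frames = [
    (-5000.0, -3000.0, -1000.0),
    (-5100.0, -3000.0, -1000.0),
    (-4900.0, -3000.0, -1000.0),
]

avg, err = summarize(frames)
print(f"E_int = {avg:.1f} +/- {err:.1f} kJ/mol")
```

For real trajectories a block-averaging or autocorrelation-aware estimate would be a more defensible error bar than the naive one used here.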

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Optuna: A next-generation hyperparameter optimization framework

Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A next-generation hyperparameter optimization framework. InProceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2623–2631,

work page

[2] [2]

doi: 10.1145/3292500.3330701

work page doi:10.1145/3292500.3330701

[3] [3]

Model context protocol specification.https: //modelcontextprotocol.io, 2024

Anthropic. Model context protocol specification.https: //modelcontextprotocol.io, 2024

work page 2024

[4] [4]

Boiko, Robert MacKnight, Ben Kline, and Gabe Gomes.Autonomouschemicalresearchwithlargelanguage models.Nature, 624(7992):570–578, 2023

Daniil A. Boiko, Robert MacKnight, Ben Kline, and Gabe Gomes.Autonomouschemicalresearchwithlargelanguage models.Nature, 624(7992):570–578, 2023. doi: 10.1038/ s41586-023-06792-0

work page 2023

[5] [5]

OpenMM 7: Rapid development of high performance algorithms for molecular dynamics.PLoS Computational Biology, 13(7):e1005659, 2017

Peter Eastman, Jason Swails, John D Chodera, Robert T McGibbon, Yutong Zhao, Kyle A Beauchamp, Lee-Ping Wang,AndrewCSimmonett,MatthewPHarrigan,ChayaD Stern, Rafal P Wiewiora, Bernard R Brooks, and Vi- jay S Pande. OpenMM 7: Rapid development of high performance algorithms for molecular dynamics.PLoS Computational Biology, 13(7):e1005659, 2017. doi: 10.137...

work page doi:10.1371/journal.pcbi.1005659 2017

[6] [6]

Desh- mukh, Yuhang Cao, Gregory Sotzing, and Rampi Ram- prasad

Rishabh Gurnani, Shubham Shukla, Dinesh Kamal, Chiho Wu, Jie Hao, Christopher Kuenneth, Pranav Aklujkar, Atharva Khomane, Ryan Daniels, Abhishek A. Desh- mukh, Yuhang Cao, Gregory Sotzing, and Rampi Ram- prasad. Artificial intelligence for polymers: An out- look.Nature Communications, 15:6107, 2024. doi: 10.1038/s41467-024-50215-1

work page doi:10.1038/s41467-024-50215-1 2024

[7] [7]

Man, Darrin M

Xibing He, Shuhan Liu, Tai-Sung Lee, Beihong Ji, Viet H. Man, Darrin M. York, and Junmei Wang. Fast, accurate, and reliable protocols for routine calculations of protein- ligandbindingaffinitiesindrugdesignprojectsusingamber gpu-tiwithff14sb/gaff.ACSOmega,5(8):4611–4619,2020. doi: 10.1021/acsomega.9b04233

work page doi:10.1021/acsomega.9b04233 2020

[8]

Wei-Tse Hsu, Dominique A. Ramirez, Tarek Sammakia, Zhongping Tan, and Michael R. Shirts. Identifying signatures of proteolytic stability and monomeric propensity in O-glycosylated insulin using molecular simulation. Journal of Computer-Aided Molecular Design, 36(5):313–328. doi: 10.1007/s10822-022-00453-6

[10]

Tran Doan Huan and Rampi Ramprasad. Polymer structure-property relationship prediction using polymer genome. Journal of Physical Chemistry Letters, 11:5823–5832. doi: 10.1021/acs.jpclett.0c01755

[12]

Zongxiao Jin, Xiaobo Sun, Xiaoli Xi, and Zuoren Nie. Simulon: An AI-assisted, PyTorch-native framework of molecular dynamics and modeling. Journal of Computational Chemistry, 2026. doi: 10.1002/jcc.70364

[13]

Julien Kern, Srikant Venkatram, Malvika Banerjee, Blair Brettmann, and Rampi Ramprasad. Solvent-free prediction of polymer glass transition temperatures from large language models. Physical Chemistry Chemical Physics, 24:26547–26554, 2022. doi: 10.1039/D2CP03899A

[14]

Christopher Kuenneth, G. Ramprasad, Chiho Kim, Ghanshyam Pilania, Arun Mannodi-Kanakkithodi, and Rampi Ramprasad. polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics. Nature Communications, 14:4099, 2023. doi: 10.1038/s41467-023-23901-8

[15]

Greg Landrum. RDKit: Open-source cheminformatics. http://www.rdkit.org, 2013

[16]

Tzyy-Shyang Lin, Connor W. Coley, Hidenobu Mochigase, Haley K. Beech, Wencong Wang, Zi Wang, Eliot Woods, Stephen L. Craig, Jeremiah A. Johnson, Julia A. Kalow, Klavs F. Jensen, and Bradley D. Olsen. BigSMILES: A structurally-based line notation for describing macromolecules. ACS Central Science, 5(9):1523–1531, 2019. doi: 10.1021/acscentsci.9b00476

[17]

Zhihan Liu, Yubo Chai, and Jianfeng Li. Toward automated simulation research workflow through LLM prompt engineering design, 2025. arXiv:2408.15512v3

[18]

Zichen Liu, Wei Ping, Nayeon Xu, Mohammad Shoeybi, and Bryan Catanzaro. Information gain-based policy optimization for multi-turn LLM agents. arXiv preprint arXiv:2510.14967, 2025. doi: 10.48550/arXiv.2510.14967

[19]

Kyle Lo, Lucy Lu Wang, Mark Neumann, Rodney Kinney, and Daniel S. Weld. S2ORC: The semantic scholar open research corpus. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4969–4983, 2020. doi: 10.18653/v1/2020.acl-main.447

[20]

James A. Maier, Carmenza Martinez, Koushik Kasavajhala, Lauren Wickstrom, Kevin E. Hauser, and Carlos Simmerling. ff14SB: Improving the accuracy of protein side chain and backbone parameters from ff99SB. Journal of Chemical Theory and Computation, 11(8):3696–3713. doi: 10.1021/acs.jctc.5b00255

[22]

L. Martínez, R. Andrade, E. G. Birgin, and J. M. Martínez. Packmol: A package for building initial configurations for molecular dynamics simulations. Journal of Computational Chemistry, 30(13):2157–2164, 2009. doi: 10.1002/jcc.21224

[23]

Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C. Landsness, Dániel L. Barabási, Siddharth Narayanan, Nicky Evans, Shriya Reddy, Martha Foiani, Aizad Kamal, Leah P. Shriver, Fang Cao, Asmamaw T. Wassie, Jon M. Laurent, Edwin Melville-Green, Mayk Caldas, Albert Bou, Kaleigh F. Roberts, Sladjana Zagorac, Timothy... arXiv preprint arXiv:2511.02824. doi: 10.48550/arXiv.2511.02824

[25]

Ollama Development Team. Ollama: Run LLMs locally. https://ollama.ai, 2023

[26]

Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, 1994

[27]

Yudong Qiu, Daniel G. A. Smith, Simon Boothroyd, Hyesu Jang, Jeffrey Wagner, Caitlin C. Bannan, Trevor Gokey, Victoria T. Lim, Chaya D. Stern, Andrea Rizzi, Xiaojun Lucas, Joshua Fass, John J. Irwin, John D. Chodera, Christopher I. Bayly, David L. Mobley, and Lee-Ping Wang. Development and benchmarking of open force field v1.0.0—the parsley small-molecule force field. Journal ... 2021. doi: 10.1021/acs.jctc.1c00571

[28]

Guanqiao Qu, Qiyuan Chen, Wei Wei, Zheng Lin, Xianhao Chen, and Kaibin Huang. Mobile edge intelligence for large language models: A contemporary survey. arXiv preprint arXiv:2407.18921, 2024. doi: 10.48550/arXiv.2407.18921

[29]

Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. Stable-Baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research, 22(268):1–8, 2021

[30]

Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016. doi: 10.1109/JPROC.2015.2494218

[31]

Zhuofan Shi, Yufei Shao, Mengyan Dai, Yadong Yu, Dong Huang, Hongxu An, Chunxiao Xin, Haiyang Shen, Zhenyu Wang, Yunshan Na, Gang Huang, and Xiang Jing. Mdagent2: Large language model for code generation and knowledge Q&A in molecular dynamics, 2026

[32]

Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger. Gaussian process optimization in the bandit setting: No regret and experimental design. In Proceedings of the 27th International Conference on Machine Learning (ICML), pages 1015–1022, 2010

[33]

Philip Stephens and Emmanuel Salawu. Mathematical framing for different agent strategies. arXiv preprint arXiv:2512.04469, 2025. doi: 10.48550/arXiv.2512.04469

[34]

Brandon M. Wood, Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi, Luis Barroso-Luque, Kareem Abdelmaqsoud, Vahe Gharakhanyan, John R. Kitchin, Daniel S. Levine, Kyle Michel, Anuroop Sriram, Taco Cohen, Abhishek Das, Ammar Rizvi, Sushree Jagriti Sahoo, Zachary W. Ulissi, and C. Lawrence Zitnick. UMA: A family of universal models for atoms. arXiv preprint arXiv..., 2025

[35]

Jiajun Xu, Zhiyuan Li, Wei Chen, Qun Wang, Xin Gao, Qi Cai, and Ziyuan Ling. On-device language models: A comprehensive review. arXiv preprint arXiv:2409.00088. doi: 10.48550/arXiv.2409.00088

[37]

Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, and Xinyun Chen. Large language models as optimizers. In International Conference on Learning Representations (ICLR), 2024

[38]

Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, and Omar M. Yaghi. ChatGPT chemistry assistant for text mining and the prediction of MOF synthesis. Journal of the American Chemical Society, 145(32):18048–18062. doi: 10.1021/jacs.3c05819

A Benchmark Algorithms

Algorithms 2 and 3 formalize the two non-agentic baselines using the notation from §2.2. Both operate on a strict subset of the degrees of freedom available to the agentic workflow (Table 1): neither maintains a hypothesis state W_t nor uses a structured state update u_selective.

Algorithm 2 RL Polymer Discovery (DQN /...