DynaMate2: Democratization of Agentic AI for Expert-Designed Custom Workflows

Ajay Vallabh; Orlando A. Mendible-Barreto; Ubaldo M. Córdova-Figueroa; Yamil J. Colón

arxiv: 2605.20819 · v1 · pith:SWGSKXF3new · submitted 2026-05-20 · ⚛️ physics.chem-ph

Pith reviewed 2026-05-21 02:46 UTC · model grok-4.3

classification ⚛️ physics.chem-ph

keywords agentic AI · scientific workflows · tool registration · multi-agent systems · computational chemistry · molecular dynamics · LLM routing · custom Python tools

The pith

DynaMate2 lets researchers register their existing Python functions as tools that AI agents can call in scientific workflows.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DynaMate2 as a way for scientists to bring their specialized, validated code into an automated multi-agent system. Instead of asking the language model to write scientific code, the design keeps all domain logic inside the researcher's own functions while the AI only decides the sequence and choice of tools. This setup aims to let more researchers automate complex workflows like those in computational chemistry without learning new programming for each project. A protocol shows how to add tools from code snippets, files, or descriptions, with changes saved automatically for later use. The approach is shown working on a full molecular dynamics example that chains preparation, simulation, and analysis steps.

Core claim

DynaMate2 is a hierarchical agentic framework whose central feature is the conversion of expert-defined Python functions into AI-callable tools inside a supervised multi-agent pipeline. The LLM is restricted to routing tasks and selecting the right tool from the registered set; it never generates or modifies scientific code. Tools and agents can be added at runtime from inline code, existing files, or plain-language descriptions, and every registration persists across sessions without extra effort. The framework is presented as an open-source template that includes a step-by-step Tool Registration Protocol and a web interface, demonstrated on an end-to-end molecular dynamics workflow.
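
The registration-and-persistence idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DynaMate2's actual API: `ToolRegistry`, `register`, `describe`, `load`, and the JSON manifest are hypothetical names, and the sketch assumes every tool is an importable top-level function so that a later session can find it again by module and name.

```python
import importlib
import inspect
import json
from pathlib import Path


class ToolRegistry:
    """Hypothetical registry mapping tool names to expert-written callables."""

    def __init__(self, manifest: Path):
        self.manifest = manifest
        self.tools = {}

    def register(self, func):
        """Expose an existing function as a tool and persist the registration."""
        self.tools[func.__name__] = func
        self._save()
        return func  # returned unchanged, so this also works as a decorator

    def describe(self):
        """Name, signature, and docstring: the only view a router would get."""
        return {
            name: f"{name}{inspect.signature(f)}: {inspect.getdoc(f) or ''}"
            for name, f in self.tools.items()
        }

    def _save(self):
        # Store only where each tool lives, never a copy of its source code.
        entries = {n: [f.__module__, f.__qualname__] for n, f in self.tools.items()}
        self.manifest.write_text(json.dumps(entries, indent=2))

    def load(self):
        """Re-import previously registered tools at the start of a new session."""
        if not self.manifest.exists():
            return
        entries = json.loads(self.manifest.read_text())
        for name, (module, qualname) in entries.items():
            self.tools[name] = getattr(importlib.import_module(module), qualname)
```

In this sketch the manifest records locations rather than code, so the validated function stays the single source of truth; a second `ToolRegistry` pointed at the same manifest recovers the tools by calling `load()`.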

What carries the argument

The runtime tool registration mechanism combined with a supervised multi-agent pipeline that limits the LLM to task routing and tool selection among expert functions.
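
One way to picture the restriction to routing is a dispatcher that accepts only names already present in the registered set. The sketch below is an assumption-laden illustration, not the paper's implementation: `route`, `keyword_chooser`, and the two toy tools are hypothetical, and `keyword_chooser` is a word-overlap stand-in for a real LLM call.

```python
from typing import Callable, Dict


def route(task: str, tools: Dict[str, Callable],
          choose: Callable[[str, Dict[str, str]], str]) -> Callable:
    """Ask `choose` (a stand-in for the LLM) for a tool name, then validate it."""
    descriptions = {name: (fn.__doc__ or "").strip() for name, fn in tools.items()}
    choice = choose(task, descriptions).strip()
    if choice not in tools:
        # The model may only select; an unknown name is rejected, never executed.
        raise LookupError(f"router returned unregistered tool: {choice!r}")
    return tools[choice]


def minimize_energy(system: str) -> str:
    """Run energy minimization on a prepared system."""
    return f"{system}:minimized"


def compute_rdf(system: str) -> str:
    """Compute a radial distribution function from a trajectory."""
    return f"{system}:rdf"


def keyword_chooser(task: str, descriptions: Dict[str, str]) -> str:
    """Toy chooser: pick the description sharing the most words with the task."""
    words = set(task.lower().split())
    return max(
        descriptions,
        key=lambda name: len(words & set(descriptions[name].lower().split())),
    )
```

Because the chooser returns a name rather than code, the worst outcome of a bad choice in this sketch is a wrong or rejected tool call, not an unreviewed piece of generated science code.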

Load-bearing premise

That the language model can correctly read tool descriptions and choose the right sequence of interdependent steps without errors that force the user to step in and debug.

What would settle it

Give the system a molecular dynamics workflow containing two tools with overlapping descriptions and measure whether it selects and chains them correctly on the first attempt or requires repeated human corrections.
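
A test of this kind could be scored with a small harness like the following. Everything here is hypothetical and for illustration only: the two tool names, their overlapping descriptions, and the labelled tasks are invented, and `always_first` is a deliberately naive baseline rather than a model of any real router.

```python
from typing import Callable, Dict, List, Tuple

# Two hypothetical tools whose descriptions deliberately overlap.
DESCRIPTIONS: Dict[str, str] = {
    "equilibrate_nvt": "Equilibrate the system at constant volume and temperature.",
    "equilibrate_npt": "Equilibrate the system at constant pressure and temperature.",
}

# (task, expected tool) pairs; the labels come from the expert, not the model.
CASES: List[Tuple[str, str]] = [
    ("equilibrate at constant volume", "equilibrate_nvt"),
    ("equilibrate at constant pressure", "equilibrate_npt"),
    ("relax the box density", "equilibrate_npt"),
]


def first_attempt_accuracy(
    choose: Callable[[str, Dict[str, str]], str],
    cases: List[Tuple[str, str]],
    descriptions: Dict[str, str],
) -> float:
    """Fraction of tasks where the chooser's first answer matches the label."""
    hits = sum(choose(task, descriptions) == expected for task, expected in cases)
    return hits / len(cases)


def always_first(task: str, descriptions: Dict[str, str]) -> str:
    """Degenerate baseline: always return the first registered tool."""
    return next(iter(descriptions))
```

Swapping `always_first` for a function that calls the router under test gives a first-attempt score that can be compared against the baseline.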

Original abstract

Scientific workflows in computational chemistry and materials science typically involve multiple interdependent steps, such as model preparation, system construction, simulation execution, and data analysis, that researchers have refined over the years into highly specialized, validated codebases. While large language model (LLM) agent frameworks have demonstrated the potential to automate such workflows, existing systems are built for specific, pre-defined task sequences. Adapting them to new domains or integrating custom expert-developed tools requires substantial programming expertise, which limits their adoption across the broader scientific community. Here we present DynaMate2, a hierarchical agentic framework and open-source template whose central design goal is to lower the barrier for any researcher to convert their existing expert-defined Python functions into AI-callable tools within a supervised multi-agent pipeline. The key design principle is that the LLM is never asked to generate scientific code since all domain logic resides in expert-defined tools. The LLM's sole responsibility is to route tasks, select the appropriate tool, and use outputs to guide subsequent actions. Tools and agents can be registered at runtime from inline code, existing source files, or plain-language descriptions, and all extensions persist automatically across sessions. We demonstrate the framework through an end-to-end molecular dynamics workflow. We provide a Tool Registration Protocol that guides researchers step-by-step through the process of integrating their validated code into the framework. DynaMate2 is released as an open-source reference implementation with a web-based interface and is designed to serve as a reusable template for community-driven extension across arbitrary scientific domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DynaMate2 gives a clean registration protocol for wrapping existing Python tools into LLM-routed agents without the model writing any science code, but the single demo leaves routing reliability untested.

The main thing to know is that DynaMate2 is a hierarchical agent framework built so researchers can turn their own validated Python functions into callable tools, with the LLM limited to routing and selection rather than generating any domain code. Runtime registration from inline snippets, source files, or plain descriptions, plus automatic persistence, is the practical addition here. They lay out a step-by-step Tool Registration Protocol that could help computational chemists integrate existing codebases without starting over, and the open-source release with a web interface plus the molecular dynamics walkthrough shows the pieces working end to end. That design choice to keep all scientific logic inside expert tools is a solid constraint that avoids common LLM hallucination risks in technical domains. The paper does a straightforward job describing how the multi-agent pipeline is meant to handle interdependent steps like model prep to simulation to analysis.

On the soft spots, the abstract only reports one demonstration workflow and gives no numbers on routing success rates, error frequencies, or how often users would need to step in when the LLM misreads tool descriptions or dependencies. Without those metrics or failure-mode breakdowns, the claim that this lowers barriers for any researcher rests on an assumption that still needs checking.

This is aimed at computational scientists who already have working Python tools and want a reusable template for adding supervised automation. A reader looking for an open starting point in agentic scientific workflows would get concrete value from the protocol and code structure. It deserves a serious referee because the framework is grounded in real workflow pain points and ships as usable open source rather than just an idea.

Referee Report

2 major / 2 minor

Summary. The manuscript presents DynaMate2, a hierarchical agentic framework and open-source template for computational chemistry and materials science workflows. Its core design allows researchers to register existing expert-defined Python functions as AI-callable tools via a Tool Registration Protocol, with LLMs restricted to task routing, tool selection, and output-guided actions rather than code generation. Tools and agents can be added at runtime from code, files, or descriptions, with automatic persistence. The framework is illustrated through a single end-to-end molecular dynamics demonstration and released with a web-based interface as a reusable template for community extension.

Significance. If the usability and reliability claims are substantiated, DynaMate2 could meaningfully lower barriers to agentic automation in scientific domains by leveraging pre-validated expert codebases instead of requiring LLMs to synthesize domain logic. The open-source release, runtime registration mechanism, and explicit Tool Registration Protocol are concrete strengths that support reproducibility and extension. The supervised multi-agent structure with domain logic isolated in Python functions aligns with best practices for reducing hallucination risks in scientific applications.

major comments (2)

[Demonstration section (end-to-end MD workflow)] The end-to-end molecular dynamics demonstration lacks any quantitative evaluation of routing reliability, such as success rates, error frequencies, or failure-mode analysis across interdependent steps (e.g., model preparation to simulation to analysis). This directly undermines the central claim that LLMs can reliably interpret tool descriptions and route tasks without substantial user intervention or debugging, as only a single qualitative workflow is shown.
[Introduction and framework description] No baseline comparisons or ablation studies are provided to quantify the advantage of the hierarchical registration protocol over existing agent frameworks or direct LLM tool-calling approaches. Without such metrics, it is difficult to assess whether the framework achieves its stated goal of democratizing access for researchers without programming expertise.

minor comments (2)

[Abstract] The abstract states that 'all extensions persist automatically across sessions' but does not clarify the underlying storage mechanism or any limitations on session state in the web interface.
[Figures] Figure captions and workflow diagrams would benefit from explicit labeling of agent hierarchy levels and tool registration points to improve readability for readers unfamiliar with agentic systems.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed review. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.

Referee: The end-to-end molecular dynamics demonstration lacks any quantitative evaluation of routing reliability, such as success rates, error frequencies, or failure-mode analysis across interdependent steps (e.g., model preparation to simulation to analysis). This directly undermines the central claim that LLMs can reliably interpret tool descriptions and route tasks without substantial user intervention or debugging, as only a single qualitative workflow is shown.

Authors: We agree that a single qualitative demonstration is insufficient to fully substantiate claims of routing reliability. The current example was chosen to illustrate the end-to-end integration of expert-defined tools rather than to serve as a statistical benchmark. In the revised manuscript we will expand the Demonstration section with repeated executions of the workflow, reporting success rates for task routing across the interdependent steps, frequencies of routing errors, and a brief failure-mode analysis. These additions will directly support the claim that the restricted LLM role (routing and tool selection only) enables reliable operation with minimal intervention.
revision: yes

Referee: No baseline comparisons or ablation studies are provided to quantify the advantage of the hierarchical registration protocol over existing agent frameworks or direct LLM tool-calling approaches. Without such metrics, it is difficult to assess whether the framework achieves its stated goal of democratizing access for researchers without programming expertise.

Authors: We acknowledge the value of quantitative comparisons. However, the primary contribution of DynaMate2 lies in the Tool Registration Protocol and the runtime, description-driven registration mechanism that allows non-programmers to integrate validated expert code without modifying agent logic. In the revised manuscript we will add a dedicated comparison subsection that contrasts the hierarchical registration approach with standard tool-calling interfaces in frameworks such as LangChain and AutoGen, focusing on the differences in persistence, description-based addition, and isolation of domain logic. Full ablation studies involving large-scale user trials are beyond the scope of this framework paper but will be noted as planned future work.
revision: partial
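
The repeated-execution reporting discussed in the first response could be tabulated along the following lines. This is a sketch under assumptions, not the authors' planned analysis: the `summarize` helper and its input convention (one entry per run, `None` for success or the name of the failing step) are hypothetical, and the Wilson score interval is one common choice for a binomial success rate with few runs.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n == 0:
        raise ValueError("need at least one run")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def summarize(
    runs: Iterable[Optional[str]],
) -> Tuple[float, Tuple[float, float], Dict[str, int]]:
    """Return (success rate, Wilson interval, failures per step).

    Each run is None on success, or the name of the step where routing failed.
    """
    runs = list(runs)
    failures = Counter(step for step in runs if step is not None)
    successes = len(runs) - sum(failures.values())
    return successes / len(runs), wilson_interval(successes, len(runs)), dict(failures)
```

Reporting the interval alongside the raw rate makes it visible how little a handful of successful runs can establish, and the per-step tally gives a starting point for a failure-mode breakdown.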

Circularity Check

0 steps flagged

No circularity: framework description with no derivations or self-referential predictions

The paper presents DynaMate2 as an open-source software framework and Tool Registration Protocol for converting expert-defined Python functions into AI-callable tools in a multi-agent pipeline. The LLM's role is limited to routing and tool selection while domain logic remains in user-provided code. No mathematical equations, fitted parameters, predictions, or first-principles derivations appear in the abstract or described content. The end-to-end MD workflow demonstration serves as an illustrative example rather than a statistically forced output or self-referential result. The design is self-contained as a reusable template without load-bearing self-citations or reductions to prior inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper contributes a new software architecture and protocol rather than deriving physical laws or fitting parameters; the main dependencies are assumptions about LLM capabilities for task routing.

axioms (1)

domain assumption: Large language models can reliably interpret tool descriptions and perform task routing in multi-step scientific workflows.
This assumption underpins the claim that the LLM's sole responsibility is routing and tool selection.

invented entities (1)

DynaMate2 hierarchical agentic framework (no independent evidence)
purpose: To enable registration of expert-defined Python functions as AI-callable tools for custom scientific workflows.
The framework and its Tool Registration Protocol are introduced as the primary contribution.
