{"paper":{"title":"ChemCrow: Augmenting large-language models with chemistry tools","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"An LLM agent augmented with 18 chemistry tools autonomously plans and executes real syntheses.","cross_cats":["stat.ML"],"primary_cat":"physics.chem-ph","authors_text":"Andres M Bran, Andrew D White, Carlo Baldassari, Oliver Schilter, Philippe Schwaller, Sam Cox","submitted_at":"2023-04-11T17:41:13Z","abstract_excerpt":"Over the last decades, excellent computational chemistry tools have been developed. Integrating them into a single platform with enhanced accessibility could help reaching their full potential by overcoming steep learning curves. Recently, large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synth"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the base large language model can reliably interpret tool outputs, avoid hallucinated chemistry, and produce valid multi-step plans without human correction or post-hoc filtering.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ChemCrow augments LLMs with 18 expert chemistry tools to autonomously plan and execute syntheses and guide molecular discoveries in organic synthesis, drug discovery, and materials design.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"An LLM agent augmented with 18 chemistry tools autonomously plans and executes real syntheses.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ade9d878fb72db3660108804e39a41b575c250ab665543a842662d956b8e76ae"},"source":{"id":"2304.05376","kind":"arxiv","version":5},"verdict":{"id":"ebd84408-7e96-4040-9ce7-126c9d56326d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T19:01:47.404607Z","strongest_claim":"Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks.","one_line_summary":"ChemCrow augments LLMs with 18 expert chemistry tools to autonomously plan and execute syntheses and guide molecular discoveries in organic synthesis, drug discovery, and materials design.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the base large language model can reliably interpret tool outputs, avoid hallucinated chemistry, and produce valid multi-step plans without human correction or post-hoc filtering.","pith_extraction_headline":"An LLM agent augmented with 18 chemistry tools autonomously plans and executes real syntheses."},"references":{"count":118,"sample":[{"doi":"","year":2018,"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","ref_index":1,"cited_arxiv_id":"1810.04805","is_internal_anchor":true},{"doi":"","year":2020,"title":"D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A., et al","work_id":"013cd642-ff3f-4b54-8bb1-47e827bbd8ec","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2021,"title":"On the Opportunities and Risks of Foundation Models","work_id":"a18039e9-928d-47c9-a836-32656a71bf71","ref_index":3,"cited_arxiv_id":"2108.07258","is_internal_anchor":true},{"doi":"","year":2022,"title":"PaLM: Scaling Language Modeling with Pathways","work_id":"a94f3ef7-2c49-4445-93fe-6ec16aafd966","ref_index":4,"cited_arxiv_id":"2204.02311","is_internal_anchor":true},{"doi":"","year":2023,"title":"Sparks of Artificial General Intelligence: Early experiments with GPT-4","work_id":"a23cfe92-7f7c-424b-98d4-b386a83002fb","ref_index":5,"cited_arxiv_id":"2303.12712","is_internal_anchor":true}],"resolved_work":118,"snapshot_sha256":"a22d6927e3754deb45688b76416f3fc84fd4eb0ac1af6ee8f61668c1e674bf03","internal_anchors":13},"formal_canon":{"evidence_count":2,"snapshot_sha256":"afccc77099260e7bc07f3ed9197598956037d0c4babe2c66da342e79691215a5"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}