{"work":{"id":"fa04f346-ee20-4e9d-bf04-3ad3569a8ed1","openalex_id":null,"doi":null,"arxiv_id":"2504.08066","raw_key":null,"title":"The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search","authors":null,"authors_text":"Yutaro Yamada, Robert Tjarko Lange, Cong Lu, Shengran Hu, Chris Lu, Jakob Foerster","year":2025,"venue":"cs.AI","abstract":"AI is increasingly playing a pivotal role in transforming how scientific discoveries are made. We introduce The AI Scientist-v2, an end-to-end agentic system capable of producing the first entirely AI generated peer-review-accepted workshop paper. This system iteratively formulates scientific hypotheses, designs and executes experiments, analyzes and visualizes data, and autonomously authors scientific manuscripts. Compared to its predecessor (v1, Lu et al., 2024 arXiv:2408.06292), The AI Scientist-v2 eliminates the reliance on human-authored code templates, generalizes effectively across diverse machine learning domains, and leverages a novel progressive agentic tree-search methodology managed by a dedicated experiment manager agent. Additionally, we enhance the AI reviewer component by integrating a Vision-Language Model (VLM) feedback loop for iterative refinement of content and aesthetics of the figures. We evaluated The AI Scientist-v2 by submitting three fully autonomous manuscripts to a peer-reviewed ICLR workshop. Notably, one manuscript achieved high enough scores to exceed the average human acceptance threshold, marking the first instance of a fully AI-generated paper successfully navigating a peer review. This accomplishment highlights the growing capability of AI in conducting all aspects of scientific research. We anticipate that further advancements in autonomous scientific discovery technologies will profoundly impact human knowledge generation, enabling unprecedented scalability in research productivity and significantly accelerating scientific breakthroughs, greatly benefiting society at large. We have open-sourced the code at https://github.com/SakanaAI/AI-Scientist-v2 to foster the future development of this transformative technology. We also discuss the role of AI in science, including AI safety.","external_url":"https://arxiv.org/abs/2504.08066","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T05:46:39.339949+00:00","pith_arxiv_id":"2504.08066","created_at":"2026-05-08T19:14:04.117985+00:00","updated_at":"2026-05-25T05:46:39.339949+00:00","title_quality_ok":true,"display_title":"The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search","render_title":"The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search"},"hub":{"state":{"work_id":"fa04f346-ee20-4e9d-bf04-3ad3569a8ed1","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":62,"external_cited_by_count":null,"distinct_field_count":12,"first_pith_cited_at":"2025-05-16T15:02:19+00:00","last_pith_cited_at":"2026-05-22T03:40:30+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-05-25T05:55:28.665834+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":14},{"context_role":"dataset","n":1},{"context_role":"other","n":1}],"polarity_counts":[{"context_polarity":"background","n":12},{"context_polarity":"unclear","n":2},{"context_polarity":"support","n":1},{"context_polarity":"use_dataset","n":1}],"runs":{"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T18:19:33.108256+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery","work_id":"56b6b58d-e73a-4317-896e-36ac5f84e957","shared_citers":20},{"title":"Towards an AI co-scientist","work_id":"485486b1-a1a2-4cde-bdda-768930c403e6","shared_citers":11},{"title":"Ai-researcher: Autonomous scientific innovation","work_id":"3845f0f0-08d4-4650-b390-6bfdd269f79a","shared_citers":8},{"title":"AlphaEvolve: A coding agent for scientific and algorithmic discovery","work_id":"76a0f850-d490-4e4f-ab98-8d25df82cd23","shared_citers":7},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":6},{"title":"Cycleresearcher: Improving automated research via automated review","work_id":"529c57f4-8402-4221-93f2-032f08d64085","shared_citers":4},{"title":"Deepscientist: Advancing frontier-pushing scientific findings progressively","work_id":"d68e01e1-6d49-4438-9cb6-114f102063a8","shared_citers":4},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":4},{"title":"GPT-4 Technical Report","work_id":"b928e041-6991-4c08-8c81-0359e4097c7b","shared_citers":4},{"title":"Mle-bench: Evaluating machine learning agents on machine learning engineering","work_id":"a671e43f-ceab-49e7-adc3-473d802a97ca","shared_citers":4},{"title":"Mlr-bench: Evaluating ai agents on open-ended machine learning research","work_id":"5096c958-8775-4f54-ac19-c33b3bca2724","shared_citers":4},{"title":"2310.03302 , archivePrefix =","work_id":"5655b20a-fbb7-4a39-8605-d6e1d689895a","shared_citers":3},{"title":"Aide: Ai-driven exploration in the space of code","work_id":"22aa3d2a-9edd-44c4-b8b8-1442ea805e01","shared_citers":3},{"title":"Ale-bench: A benchmark for long-horizon objective-driven algorithm engineering","work_id":"8dd988c6-feec-4de9-b185-eb358e229675","shared_citers":3},{"title":"and Moor, M","work_id":"311ec4e1-01f8-4353-8d69-1013fa0ffab4","shared_citers":3},{"title":"Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers","work_id":"95c4070f-48fa-44f6-bed8-a4b874c54eac","shared_citers":3},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":3},{"title":"doi: 10.18653/v1/2025.findings-emnlp.320","work_id":"11e386dd-1235-4d81-ba06-dd0717cd145b","shared_citers":3},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":3},{"title":"Galactica: A Large Language Model for Science","work_id":"fddd3111-c69a-4453-8bf2-d3517e863145","shared_citers":3},{"title":"Internagent: When agent becomes the scientist–building closed-loop system from hypothesis to verification","work_id":"edf06200-2610-4531-94a8-2b9a80fd108c","shared_citers":3},{"title":"Landsness, Daniel L","work_id":"f21b1a4c-dda4-447b-a222-692f1ecf62dd","shared_citers":3},{"title":"Llama 2: Open Foundation and Fine-Tuned Chat Models","work_id":"68a5177f-d644-44c1-bd4f-4e5278c22f5d","shared_citers":3},{"title":"Openalex: A fully-open index of scholarly works, authors, venues, institutions, and concepts","work_id":"01569e5d-750c-4621-b5ec-9bec2aa7a6b4","shared_citers":3}],"time_series":[{"n":31,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T18:19:46.937298+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T18:19:25.318640+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search","claims":[{"claim_text":"AI is increasingly playing a pivotal role in transforming how scientific discoveries are made. We introduce The AI Scientist-v2, an end-to-end agentic system capable of producing the first entirely AI generated peer-review-accepted workshop paper. This system iteratively formulates scientific hypotheses, designs and executes experiments, analyzes and visualizes data, and autonomously authors scientific manuscripts. Compared to its predecessor (v1, Lu et al., 2024 arXiv:2408.06292), The AI Scientist-v2 eliminates the reliance on human-authored code templates, generalizes effectively across dive","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T18:19:46.863021+00:00"}},"summary":{"title":"The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search","claims":[{"claim_text":"AI is increasingly playing a pivotal role in transforming how scientific discoveries are made. We introduce The AI Scientist-v2, an end-to-end agentic system capable of producing the first entirely AI generated peer-review-accepted workshop paper. This system iteratively formulates scientific hypotheses, designs and executes experiments, analyzes and visualizes data, and autonomously authors scientific manuscripts. Compared to its predecessor (v1, Lu et al., 2024 arXiv:2408.06292), The AI Scientist-v2 eliminates the reliance on human-authored code templates, generalizes effectively across dive","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery","work_id":"56b6b58d-e73a-4317-896e-36ac5f84e957","shared_citers":20},{"title":"Towards an AI co-scientist","work_id":"485486b1-a1a2-4cde-bdda-768930c403e6","shared_citers":11},{"title":"Ai-researcher: Autonomous scientific innovation","work_id":"3845f0f0-08d4-4650-b390-6bfdd269f79a","shared_citers":8},{"title":"AlphaEvolve: A coding agent for scientific and algorithmic discovery","work_id":"76a0f850-d490-4e4f-ab98-8d25df82cd23","shared_citers":7},{"title":"Qwen3 Technical Report","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","shared_citers":6},{"title":"Cycleresearcher: Improving automated research via automated review","work_id":"529c57f4-8402-4221-93f2-032f08d64085","shared_citers":4},{"title":"Deepscientist: Advancing frontier-pushing scientific findings progressively","work_id":"d68e01e1-6d49-4438-9cb6-114f102063a8","shared_citers":4},{"title":"DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models","work_id":"c5006563-f3ec-438a-9e35-b7b484f34828","shared_citers":4},{"title":"GPT-4 Technical Report","work_id":"b928e041-6991-4c08-8c81-0359e4097c7b","shared_citers":4},{"title":"Mle-bench: Evaluating machine learning agents on machine learning engineering","work_id":"a671e43f-ceab-49e7-adc3-473d802a97ca","shared_citers":4},{"title":"Mlr-bench: Evaluating ai agents on open-ended machine learning research","work_id":"5096c958-8775-4f54-ac19-c33b3bca2724","shared_citers":4},{"title":"2310.03302 , archivePrefix =","work_id":"5655b20a-fbb7-4a39-8605-d6e1d689895a","shared_citers":3},{"title":"Aide: Ai-driven exploration in the space of code","work_id":"22aa3d2a-9edd-44c4-b8b8-1442ea805e01","shared_citers":3},{"title":"Ale-bench: A benchmark for long-horizon objective-driven algorithm engineering","work_id":"8dd988c6-feec-4de9-b185-eb358e229675","shared_citers":3},{"title":"and Moor, M","work_id":"311ec4e1-01f8-4353-8d69-1013fa0ffab4","shared_citers":3},{"title":"Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers","work_id":"95c4070f-48fa-44f6-bed8-a4b874c54eac","shared_citers":3},{"title":"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning","work_id":"e6b75ad5-2877-4168-97c8-710407094d20","shared_citers":3},{"title":"doi: 10.18653/v1/2025.findings-emnlp.320","work_id":"11e386dd-1235-4d81-ba06-dd0717cd145b","shared_citers":3},{"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","shared_citers":3},{"title":"Galactica: A Large Language Model for Science","work_id":"fddd3111-c69a-4453-8bf2-d3517e863145","shared_citers":3},{"title":"Internagent: When agent becomes the scientist–building closed-loop system from hypothesis to verification","work_id":"edf06200-2610-4531-94a8-2b9a80fd108c","shared_citers":3},{"title":"Landsness, Daniel L","work_id":"f21b1a4c-dda4-447b-a222-692f1ecf62dd","shared_citers":3},{"title":"Llama 2: Open Foundation and Fine-Tuned Chat Models","work_id":"68a5177f-d644-44c1-bd4f-4e5278c22f5d","shared_citers":3},{"title":"Openalex: A fully-open index of scholarly works, authors, venues, institutions, and concepts","work_id":"01569e5d-750c-4621-b5ec-9bec2aa7a6b4","shared_citers":3}],"time_series":[{"n":31,"year":2026}],"dependency_candidates":[]},"authors":[]}}