arxiv: 2509.06503 · v2 · pith:2QRJUZEZnew · submitted 2025-09-08 · 💻 cs.AI · q-bio.QM

An AI system to help scientists write expert-level empirical software

classification 💻 cs.AI q-bio.QM
keywords expert-level · software · novel · scientific · system · analysis · cite · diverse

The cycle of scientific discovery is frequently bottlenecked by the slow, manual creation of software to support computational experiments\cite{hannay2009how}. To address this, we present Empirical Research Assistance (ERA), an AI system that creates expert-level scientific software whose goal is to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS)\cite{silver2016mastering} to systematically improve the quality metric and intelligently navigate the large space of possible solutions. ERA achieves expert-level results when it explores and integrates complex research ideas from external sources. The effectiveness of tree search is demonstrated across a diverse range of tasks. In bioinformatics, ERA discovered 40 novel methods for single-cell data analysis that outperformed the top human-developed methods on a public leaderboard. In epidemiology, ERA generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations. ERA also produced expert-level software for geospatial analysis, neural activity prediction in zebrafish, and numerical solution of integrals, and a novel rule-based construction for time series forecasting. By devising and implementing novel solutions to diverse tasks, ERA represents a significant step towards accelerating scientific progress.
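The abstract describes the search loop only at a high level, so the following is a minimal sketch of the general idea rather than ERA's actual algorithm. It assumes a greedy best-first selection rule (the abstract does not state which tree search variant is used), and every name in it (`Node`, `tree_search`, `toy_propose`, `toy_score`) is hypothetical. The `propose` callback stands in for the LLM that rewrites a candidate program, and `score` stands in for the quality metric; here both are replaced by a toy numeric problem so the block runs on its own.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate solution in the search tree (a stand-in for a program)."""
    candidate: List[float]
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    root_candidate: List[float],
    propose: Callable[[List[float], int], List[List[float]]],
    score: Callable[[List[float]], float],
    n_expansions: int = 20,
    branching: int = 4,
) -> Node:
    """Greedy best-first search: always expand the highest-scoring unexpanded node."""
    root = Node(root_candidate, score(root_candidate))
    best = root
    tie_breaker = itertools.count()  # keeps heap entries comparable on score ties
    frontier = [(-root.score, next(tie_breaker), root)]  # max-heap via negated score
    for _ in range(n_expansions):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        # In a real system this call would ask an LLM for rewritten programs.
        for child_candidate in propose(node.candidate, branching):
            child = Node(child_candidate, score(child_candidate), parent=node)
            node.children.append(child)
            heapq.heappush(frontier, (-child.score, next(tie_breaker), child))
            if child.score > best.score:
                best = child
    return best


def toy_score(x: List[float]) -> float:
    """Toy quality metric: negative squared distance to a fixed target."""
    target = [3.0, -1.0]
    return -sum((a - b) ** 2 for a, b in zip(x, target))


def toy_propose(x: List[float], k: int) -> List[List[float]]:
    """Toy proposer: step each coordinate by +/-0.5 and return the first k variants."""
    variants = []
    for i in range(len(x)):
        for delta in (0.5, -0.5):
            y = list(x)
            y[i] += delta
            variants.append(y)
    return variants[:k]


best = tree_search([0.0, 0.0], toy_propose, toy_score)
print(best.candidate, best.score)
```

Because each node keeps a link to its parent, the chain of edits that led to the best candidate can be recovered by following `parent` back to the root. A selection rule that balances exploration against exploitation could replace the greedy heap without changing the rest of the loop.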

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Prospective multi-pathogen disease forecasting using autonomous LLM-guided tree search

    cs.AI 2026-05 unverdicted novelty 7.0

    An LLM-guided tree search system autonomously creates diverse forecasting models that match or beat CDC human-curated ensembles in a 2025-2026 prospective multi-pathogen evaluation.

  2. Probabilistic Seasonal Streamflow Forecasting Across California's Sierra Nevada Watersheds with Agentic AI

    physics.ao-ph 2026-05 unverdicted novelty 7.0

    An agentic AI workflow evolves an adaptive XGBoost quantile regression ensemble that reduces watershed-averaged forecast error by up to 29% versus California's operational forecasts for April-July runoff at 1-6 month ...

  3. Optimized Three-Dimensional Photovoltaic Structures with LLM guided Tree Search

    cs.CL 2026-05 conditional novelty 6.0

    LLM-guided tree search with coding agents optimizes 3D photovoltaic designs for higher diurnal energy yield after correcting for simulation exploits.

  4. Glia: A Human-Inspired AI for Automated Systems Design and Optimization

    cs.AI 2025-10 unverdicted novelty 6.0

    Glia deploys a multi-agent LLM workflow with reasoning, experimentation, and analysis agents to generate interpretable algorithms for request routing, scheduling, and auto-scaling in distributed GPU clusters, reaching...

  5. ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms

    cs.LG 2025-12 unverdicted novelty 5.0

    ATHENA introduces an agentic team framework that autonomously manages the end-to-end computational research lifecycle via a knowledge-driven HENA loop to achieve validation errors of 10^{-14} in scientific computing a...