Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

Shuo Ren, Pu Jian, Zhenjiang Ren, Chunlin Leng, Can Xie, Jiajun Zhang · 2025 · arXiv 2503.24047

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · dataset: 1

citation-polarity summary

background: 3 · use dataset: 1

representative citing papers

PDEAgent-Bench: A Multi-Metric, Multi-Library Benchmark for PDE Solver Generation

cs.AI · 2026-05-10 · unverdicted · novelty 8.0

PDEAgent-Bench is the first multi-metric, multi-library benchmark for AI-generated PDE solvers, evaluating executability, numerical accuracy, and efficiency across DOLFINx, Firedrake, and deal.II.

ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences

cs.AI · 2026-02-11 · accept · novelty 8.0

ReplicatorBench evaluates LLM agents on replicating social and behavioral science claims across retrieval, computation, and interpretation stages, finding strength in experiment execution but weakness in resource retrieval.

Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

cs.CL · 2026-04-28 · unverdicted · novelty 7.0

Luminol-AIDetect detects machine-generated text zero-shot by extracting perplexity-based features from original and shuffled text versions, using density estimation and ensemble prediction to exploit greater structural fragility in AI output.

Weak-Link Optimization for Multi-Agent Reasoning and Collaboration

cs.AI · 2026-04-17 · unverdicted · novelty 7.0

WORC improves multi-agent LLM reasoning to 82.2% average accuracy by predicting and compensating for the weakest agent via targeted extra sampling rather than uniform reinforcement.

AlphaEvolve: A coding agent for scientific and algorithmic discovery

cs.AI · 2025-06-16 · unverdicted · novelty 7.0

AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, plus optimizations for Google data centers and hardware.

OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research

cond-mat.mtrl-sci · 2026-05-13 · unverdicted · novelty 6.0

OpenAaaS is a hierarchical agent-as-a-service system that enables secure multi-agent collaboration for materials informatics by moving code to data rather than data to code.

Position: Academic Conferences are Potentially Facing Denominator Gaming Caused by Fully Automated Scientific Agents

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

Malicious actors could use AI agents to submit large numbers of fake papers, inflating the submission count and thereby raising the acceptance odds for a small set of chosen legitimate papers under stable conference acceptance rates.

From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation

cs.AI · 2026-04-23 · conditional · novelty 6.0

Agentic architecture automates research-question-to-workflow translation via LLM intent extraction, deterministic generators, and Skills, raising intent accuracy from 44% to 83% and cutting data transfer by 92% in a 1000 Genomes evaluation.

DORA Explorer: Improving the Exploration Ability of LLMs Without Training

cs.CL · 2026-04-19 · unverdicted · novelty 5.0

DORA Explorer boosts LLM agent exploration without training by ranking diverse actions using log-probabilities and a tunable parameter, yielding UCB-competitive results on multi-armed bandits and gains on text adventure environments.

TurboAgent: An LLM-Driven Autonomous Multi-Agent Framework for Turbomachinery Aerodynamic Design

cs.AI · 2026-04-08 · unverdicted · novelty 5.0

TurboAgent uses an LLM as coordinator for specialized agents to autonomously generate, predict, optimize, and validate turbomachinery designs, achieving R² > 0.91 agreement with CFD on a transonic compressor and 1.61% efficiency gains.

Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator

cs.DL · 2025-07-16 · unverdicted · novelty 4.0

The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.

URSA: The Universal Research and Scientific Agent

cs.AI · 2025-06-27 · unverdicted · novelty 4.0

URSA is a modular agent ecosystem that uses LLMs and scientific tools to accelerate research tasks of varying complexity.

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

cs.OS · 2025-11-04

citing papers explorer

Showing 13 of 13 citing papers.

PDEAgent-Bench: A Multi-Metric, Multi-Library Benchmark for PDE Solver Generation · cs.AI · 2026-05-10 · unverdicted · none · ref 39
ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences · cs.AI · 2026-02-11 · accept · none · ref 13
Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling · cs.CL · 2026-04-28 · unverdicted · none · ref 25
Weak-Link Optimization for Multi-Agent Reasoning and Collaboration · cs.AI · 2026-04-17 · unverdicted · none · ref 12
AlphaEvolve: A coding agent for scientific and algorithmic discovery · cs.AI · 2025-06-16 · unverdicted · none · ref 82
OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research · cond-mat.mtrl-sci · 2026-05-13 · unverdicted · none · ref 34
Position: Academic Conferences are Potentially Facing Denominator Gaming Caused by Fully Automated Scientific Agents · cs.CL · 2026-05-11 · unverdicted · none · ref 20
From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation · cs.AI · 2026-04-23 · conditional · none · ref 9
DORA Explorer: Improving the Exploration Ability of LLMs Without Training · cs.CL · 2026-04-19 · unverdicted · none · ref 1
TurboAgent: An LLM-Driven Autonomous Multi-Agent Framework for Turbomachinery Aerodynamic Design · cs.AI · 2026-04-08 · unverdicted · none · ref 22
Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator · cs.DL · 2025-07-16 · unverdicted · none · ref 148
URSA: The Universal Research and Scientific Agent · cs.AI · 2025-06-27 · unverdicted · none · ref 16
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live · cs.OS · 2025-11-04 · unreviewed · ref 61
