Robin: A multi-agent system for automating scientific discovery

Ali Essam Ghareeb, Benjamin Chang, Ludovico Mitchener, Angela Yiu, Caralyn J · 2025 · arXiv 2505.13400

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 2 · background 1

citation-polarity summary

baseline 2 · background 1

representative citing papers

TO-Agents: A Multi-Agent AI Pipeline for Preference-Guided Topology Optimization

cs.AI · 2026-05-20 · unverdicted · novelty 7.0

A multi-agent pipeline iteratively refines topology optimization outputs to match natural language preferences for branched structures, achieving a 60% success rate across replicates in cantilever and phone-stand tasks.

AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

cs.CL · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

AgentForesight introduces an online auditor model that predicts decisive errors in multi-agent trajectories at the earliest step, using a coarse-to-fine reinforcement learning recipe on a new curated dataset, AFTraj-2K.

RosettaSearch: Multi-Objective Inference-Time Search for Protein Sequence Design

cs.LG · 2026-04-19 · unverdicted · novelty 7.0

RosettaSearch applies LLM-driven multi-objective search at inference time to improve backbone-conditioned protein sequences, recovering designs with 18-68% better structural fidelity and 2.5x higher success rates than single-pass models like LigandMPNN.

Kosmos: An AI Scientist for Autonomous Discovery

cs.AI · 2025-11-04 · unverdicted · novelty 7.0

Kosmos is an AI scientist that maintains coherence over hundreds of agent steps via a shared world model, executes thousands of code lines and reads thousands of papers per run, and produces traceable reports with 79.4% statement accuracy according to independent reviewers.

Autonomous Scientific Discovery via Iterative Meta-Reflection

cs.CV · 2026-07-01 · unverdicted · novelty 6.0

DiscoPER uses code generation, statistical validation, and second-order meta-reflection on accumulated discoveries to recover 8 of 9 known ecological patterns on a new benchmark at a 72.7% support rate.

Testing Frontier Large Language Models' Physics Literacy in Parallel Physical Worlds

cs.LG · 2026-06-30 · unverdicted · novelty 6.0

Introduces an auditable four-stage diagnostic for LLM physics reasoning in novel frameworks and applies it to three parallel worlds, yielding pass rates of 6/15, 6/15, and 0/15 on frontier models with noted qualitative-quantitative asymmetry.

AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

Decentralized AI agent teams self-organize around hypotheses, critique proposals, and share knowledge to outperform single-agent baselines on biomedical ML, language-model optimization, and protein fitness tasks.

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

cs.CL · 2026-05-18 · unverdicted · novelty 6.0

EvoMemBench evaluates 15 memory methods for LLM agents and finds long-context baselines competitive, with no single memory approach working consistently across settings.

PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research

cs.LG · 2026-04-16 · unverdicted · novelty 6.0

PRL-Bench evaluates frontier LLMs on 100 real physics research tasks and finds the best models score below 50, exposing a gap to autonomous discovery.

DeepReviewer 2.0: A Traceable Agentic System for Auditable Scientific Peer Review

cs.AI · 2026-03-03 · unverdicted · novelty 6.0

An agentic system produces traceable review packages, and an un-finetuned 196B model using it covers more major issues than Gemini-3.1-Pro on 134 ICLR 2025 submissions while winning most blind comparisons to human committees.

Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators

cs.MA · 2026-05-21 · unverdicted · novelty 5.0

Sibyl-AutoResearch introduces self-evolving trial-and-error harnesses with auditable conversion units that link trial signals to updated research behaviors and harness repairs in autonomous systems.

Claw AI Lab: An Autonomous Multi-Agent Research Team

cs.AI · 2026-05-21 · unverdicted · novelty 4.0

Claw AI Lab presents an interactive multi-agent platform for autonomous AI research that supports customizable teams, real-time control, and a code harness for experiment integration and result integrity.

VERITAS: A Multi-Agent Co-Scientist for Verifiable Image-Derived Hypothesis Testing

cs.MA · 2026-04-13
