Sciagents: Automating scientific discovery through bioinspired multi-agent intelligent graph reasoning
9 Pith papers cite this work.
citation-role summary
citation-polarity summary
roles: background 2
polarities: background 2
representative citing papers
DN-Hypo-Pipeline operationalizes three philosophy-of-science accounts to direct LLMs toward principle-based hypothesis generation, claims superior performance over direct prompting, and derives two new transformer algorithms from the resulting hypotheses.
A new filtration-based conformal prediction method attributes errors in multi-agent systems by producing contiguous sequence sets with finite-sample coverage guarantees, enabling rollback recovery.
ChargeBD constructs a 500-question ESS-LLM Benchmark from 50 RFB tasks and evaluates 16 MBTI-inspired persona agents on DeepSeek-V3-Plus to create capability and cognitive advantage matrices for guided battery engineering.
A category-theoretic model frames scientific discovery as verified regime transitions via left Kan extensions that preserve and compare artifacts across schema changes in agentic AI.
Compass is an expert-guided LLM agent framework that extracts 3,751 marine Pb records from 230k papers to build the largest integrated database, achieving 92% accuracy via multi-layered validation.
Coordinated AI agents improve scientific inference from partial evidence in cross-domain tasks when single sources are incomplete, as demonstrated by AUROC gains in vector-borne disease and exoplanet benchmarks but tied performance in others.
The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.
citing papers explorer
- TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research
TianJi-Environ is a WRF-Chem-based multi-agent AI framework for autonomous validation of atmospheric chemistry mechanisms through executable experiments and evidence assessment.
- DN-Hypo-Pipeline: An AI-Driven Workflow for Generating Hypotheses using Large Language Models and Scientific Explanations
DN-Hypo-Pipeline operationalizes three philosophy-of-science accounts to direct LLMs toward principle-based hypothesis generation, claims superior performance over direct prompting, and derives two new transformer algorithms from the resulting hypotheses.
- Conformal Agent Error Attribution
A new filtration-based conformal prediction method attributes errors in multi-agent systems by producing contiguous sequence sets with finite-sample coverage guarantees, enabling rollback recovery.
- ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development
ChargeBD constructs a 500-question ESS-LLM Benchmark from 50 RFB tasks and evaluates 16 MBTI-inspired persona agents on DeepSeek-V3-Plus to create capability and cognitive advantage matrices for guided battery engineering.
- Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence
A category-theoretic model frames scientific discovery as verified regime transitions via left Kan extensions that preserve and compare artifacts across schema changes in agentic AI.
- Compass: Navigating Global Marine Lead Data Integration through Expert-Guided LLM Agent
Compass is an expert-guided LLM agent framework that extracts 3,751 marine Pb records from 230k papers to build the largest integrated database, achieving 92% accuracy via multi-layered validation.
- Cross-domain benchmarks reveal when coordinated AI agents improve scientific inference from partial evidence
Coordinated AI agents improve scientific inference from partial evidence in cross-domain tasks when single sources are incomplete, as demonstrated by AUROC gains in vector-borne disease and exoplanet benchmarks but tied performance in others.