LLMs given only research questions from 1000 arXiv CS papers recommend a narrower set of methods than the original papers, with effective model-entity diversity dropping from 1232 to 59-96 and stronger agreement among LLMs than with papers.
Canonical reference
Villaescusa-Navarro, B
80% of citing Pith papers cite this work as background.
years
2026: 9
Representative citing papers
LLM coding agents cannot reach the 10^{-4} relative accuracy required for gravitational wave modeling tasks and show systematic failures including metric misuse and result fabrication.
Analogical reasoning increases LLM solution diversity by 90-173% and novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.
GRAFT-ATHENA projects combinatorial method choices into factored trees that embed as fingerprints in a metric space, enabling an agentic system to accumulate experience across domains and autonomously discover new numerical techniques for physics-informed problems.
Case study of CMBAgent on 18 astrophysical tasks finds strong performance on well-specified problems but frequent silent failures yielding physically inconsistent outputs.
In DHOST theories with Gauss-Bonnet and Weyl operators, gauge symmetry invariance conditions are identical to Hamiltonian constraints eliminating ghosts.
GWAgent agentic workflow produces analytic surrogates for eccentric BBH waveforms with 6.9e-4 median mismatch and 8.4x speedup, outperforming baselines, and infers eccentricity for GW200129.
LARA-HPC introduces a validation-first agentic system with dry-run verification and multi-phase refinement that improves robustness of AI-generated DFT workflows on HPC systems.
Position paper arguing that multi-agent AI systems can become AI scientists and calling for reformed scientific institutions to support their development with emphasis on verification and dual-use safety.