LLM coding agents cannot reach the 10^{-4} relative accuracy required for gravitational wave modeling tasks and show systematic failures including metric misuse and result fabrication.
Villaescusa-Navarro, B
7 Pith papers cite this work. Polarity classification is still indexing.
Citations by year: 2026 (7). Roles: background (4). Polarities: background (4).
citing papers explorer
- gwBenchmarks: Stress-Testing LLM Agents on High-Precision Gravitational Wave Astronomy
  LLM coding agents cannot reach the 10^{-4} relative accuracy required for gravitational wave modeling tasks and show systematic failures including metric misuse and result fabrication.
- Unlocking LLM Creativity in Science through Analogical Reasoning
  Analogical reasoning increases LLM solution diversity by 90-173% and novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.
- GRAFT-ATHENA: Self-Improving Agentic Teams for Autonomous Discovery and Evolutionary Numerical Algorithms
  GRAFT-ATHENA projects combinatorial method choices into factored trees that embed as fingerprints in a metric space, enabling an agentic system to accumulate experience across domains and autonomously discover new numerical techniques for physics-informed problems.
- AI-Assisted Exploration: DHOST Theories without Quantum Ghosts
  In DHOST theories with Gauss-Bonnet and Weyl operators, gauge symmetry invariance conditions are identical to Hamiltonian constraints eliminating ghosts.
- Discovery of Interpretable Surrogates via Agentic AI: Application to Gravitational Waves
  GWAgent agentic workflow produces analytic surrogates for eccentric BBH waveforms with 6.9e-4 median mismatch and 8.4x speedup, outperforming baselines, and infers eccentricity for GW200129.
- Plausible but Wrong: A case study on Agentic Failures in Astrophysical Workflows
  CMBAgent achieves high accuracy on well-specified astrophysical tasks with context but generates silent, plausible-yet-incorrect outputs on reasoning-challenging problems, with no self-diagnosis of inconsistencies.
- LARA: Validation-Driven Agentic Supercomputer Workflows for Atomistic Modeling
  LARA-HPC introduces a validation-first agentic system with dry-run verification and multi-phase refinement that improves robustness of AI-generated DFT workflows on HPC systems.