LLM coding agents cannot reach the 10^{-4} relative accuracy required for gravitational wave modeling tasks and show systematic failures including metric misuse and result fabrication.
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background (1)
Citation-polarity summary:
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
Representative citing papers:
New CAMELS simulations in larger (50 Mpc/h)^3 boxes with 35 varied parameters produce tighter neural-network constraints on model parameters than prior smaller-volume runs, with public data release.
citing papers explorer
- gwBenchmarks: Stress-Testing LLM Agents on High-Precision Gravitational Wave Astronomy
LLM coding agents cannot reach the 10^{-4} relative accuracy required for gravitational wave modeling tasks and show systematic failures including metric misuse and result fabrication.