The Ideation Bottleneck: Decomposing the Quality Gap Between AI-Generated and Human Economics Research

2026 · econ.GN · arXiv 2604.03338

abstract

Autonomous AI systems can now generate complete economics research papers, but they substantially underperform human-authored publications in head-to-head comparisons. This paper decomposes the quality gap into two independent components: research idea quality and execution quality. We evaluate idea quality with a two-model ensemble of fine-tuned language models trained on publication decisions (Gong, Li, and Zhou, 2026). We evaluate execution quality with a comprehensive six-dimension rubric assessed by Gemini 3.1 Flash Lite, the same model family used as the APE tournament judge, which ensures methodological consistency. We analyze 953 economics papers: 912 AI-generated papers from the APE project and 41 human papers published in the American Economic Review and AEJ: Economic Policy. The idea quality gap is large (Cohen's d = 2.23, p < 0.001), with human papers achieving 47.1% mean ensemble exceptional probability versus 16.5% for AI. The execution quality gap is also significant but smaller (d = 0.90, p < 0.001), with human papers scoring 4.38/5.0 versus 3.84/5.0 for AI. Idea quality accounts for approximately 71% of the overall quality difference, with execution contributing 29%. The largest execution weakness is mechanism analysis depth (d = 1.43); no significant difference is found on robustness. We document that 74% of AI papers employ difference-in-differences, and only 7 AI papers (0.8%) surpass the median human paper on both idea and execution quality simultaneously. The primary bottleneck to competitive AI-generated economics research remains ideation.
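
The effect sizes and shares quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: `cohens_d` uses the standard pooled-standard-deviation form of Cohen's d, and `gap_shares` assumes the 71%/29% split is proportional to the two reported effect sizes. The abstract does not state how the decomposition was computed, so that proportional split is an assumption that happens to reproduce the reported figures, not the authors' documented method. Both function names are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with the pooled sample standard deviation (standard textbook form)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def gap_shares(d_idea, d_execution):
    """ASSUMPTION: each component's share is its effect size over the sum of both."""
    total = d_idea + d_execution
    return d_idea / total, d_execution / total


# Effect sizes as reported in the abstract.
idea_share, execution_share = gap_shares(2.23, 0.90)
print(f"idea share: {idea_share:.0%}, execution share: {execution_share:.0%}")

# 7 of 912 AI papers beat the median human paper on both dimensions.
print(f"AI papers above both human medians: {7 / 912:.1%}")
```

Under this assumption the shares come out to roughly 71% and 29%, and 7 of 912 rounds to 0.8%, matching the abstract.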

representative citing papers

Merit or networks? What decides where research is published

econ.GN · 2026-06-02 · unverdicted · novelty 6.0

LLM-based pre-publication idea quality scoring on 6208 economics papers shows execution sets a meritocratic floor, idea quality grades intermediate rungs, and connections provide a bounded advantage mainly at top journals.

