Merit or networks? What decides where research is published
Pith reviewed 2026-06-28 07:51 UTC · model grok-4.3
The pith
Economics publishing follows a prestige ladder: execution quality sets the floor, idea quality grades the rungs, and connections set a ceiling only at the top journals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using a text-based idea quality score from an LLM evaluator, the study finds that journal placement in economics follows a sequence along the prestige ladder. Execution quality establishes a meritocratic floor and remains the largest input overall. Text-legible idea quality grades the intermediate rungs. Connections impose a favoritism ceiling that matters most near the apex. Connections operate through two additive channels: connected authors produce higher-scoring papers, and at equal scores their papers still place better. Yet the advantage is bounded, as even the highest-scoring papers face real friction reaching the visible journal ladder. The result nests, rather than chooses between, the meritocracy and network accounts of how science is published.
What carries the argument
The prestige ladder model sequencing execution quality as floor, text-legible idea quality as rungs, and connections as ceiling, estimated via a five-input production function for journal placement.
If this is right
- Higher execution quality raises placement odds more than any other single input.
- Idea quality predicts movement across intermediate journal tiers once execution clears the floor.
- A higher connection index raises placement probability at equal idea and execution scores.
- The connection advantage is largest for the most selective journals.
- Even the highest idea-quality papers encounter barriers to the apex.
- Ordinary connected papers still rarely reach top outlets.
Where Pith is reading between the lines
- The same text-based scoring method could test whether the floor-rungs-ceiling sequence holds in other fields.
- Journals could explore blind idea-quality screens to limit connection effects at the top.
- Interventions could target either the idea-generation channel or the review channel of the connection advantage.
- Changes in the relative weights of these inputs over time would reveal whether publishing is shifting toward greater or lesser meritocracy.
Load-bearing premise
The discipline-trained LLM evaluator scores idea quality from text without seeing author names or outcomes, and the resulting score is a valid, unbiased measure that is available before a paper's publication fate is known.
What would settle it
The claim would fail if the LLM idea-quality scores show no correlation with journal placement after controlling for execution and connections, or if independent blind expert ratings of the same papers produce different rankings that eliminate the connection effect.
Original abstract
Does scientific publishing reward the quality of ideas or the advantage of connections? The question is universal to prestige-driven science, yet it has resisted decades of study because a paper's quality could not be gauged ahead of its publication fate without using that fate as the yardstick. We break this constraint by measuring a paper's idea quality directly from its text, before publication, using a discipline-trained LLM evaluator that scores the idea without seeing author names or outcomes. Using economics as a case study, we combine this text-legible idea-quality score with an execution-quality rubric, a connection index, an author-ability index, and an off-the-shelf language-model text score to estimate a five-input production function for journal placement across 6,208 economics working papers. The inputs are not rivals but a sequence along the ladder of prestige. Execution sets a meritocratic floor and is the largest input overall. Text-legible idea quality grades the rungs in between. Connections set a favoritism ceiling that bites mainly near the apex, the most selective journals. Connections work through two additive channels: connected authors write papers that score higher, and at equal scores their papers are still more likely to place better. Yet this advantage is bounded. Connections raise the odds of every rung without making the apex the typical outcome for ordinary ideas, and even the highest-scoring papers face real friction reaching the visible journal ladder. The result nests, rather than chooses between, the meritocracy and network accounts of how science is published.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a discipline-trained LLM can score the idea quality of economics working papers directly from text, without seeing author names or outcomes. It combines this score with an execution-quality rubric, a connection index, an author-ability index, and an off-the-shelf LM text score in a five-input production function estimated on 6,208 working papers. The estimates show execution setting a meritocratic floor (the largest input overall), idea quality grading the intermediate rungs, and connections setting a favoritism ceiling that operates mainly at the most selective journals. Connections work through two additive channels: connected authors produce higher-scoring papers and, conditional on score, still place better.
Significance. If the LLM score is shown to be a valid, independent measure of idea quality, the decomposition would provide a novel empirical nesting of meritocratic and network accounts of journal placement, with quantitative estimates of each input's contribution and bounds on the scope of favoritism. The approach could generalize beyond economics if the core measurement innovation holds.
major comments (3)
- [Abstract, §3] Abstract and §3 (LLM evaluator): the claim that the discipline-trained LLM provides a valid, pre-publication measure of idea quality independent of networks or publication outcomes is load-bearing for the entire decomposition, yet no training corpus details, fine-tuning procedure, correlation with blinded expert ratings, or out-of-sample predictive checks are reported. Without these, the score may internalize prestige-correlated text features.
- [§4] §4 (production function): the five-input function is estimated on the same sample used to construct the inputs (idea quality, execution, connections, etc.), with no mention of hold-out validation, external benchmarks, or robustness to alternative functional forms; this directly raises the circularity risk noted in the stress test and undermines the reported sequence of effect sizes.
- [§5] §5 (results on connections): the two-channel claim (higher scores plus residual placement advantage) is central to the 'additive' conclusion, but the manuscript supplies no explicit test separating whether the residual channel reflects favoritism versus unmeasured execution or idea dimensions that the LLM rubric misses.
minor comments (2)
- [Table 1, Figure 2] Table 1 and Figure 2: variable definitions and scaling for the connection index and author-ability index should be stated explicitly so readers can assess overlap with the LLM score.
- [Data section] The abstract states '6,208 economics working papers' but does not specify the sampling frame or exclusion criteria; this belongs in the data section for replicability.
Simulated Author's Rebuttal
We are grateful to the referee for highlighting key areas where the manuscript's claims require stronger supporting evidence. Below we respond to each major comment and indicate planned revisions.
Point-by-point responses
- Referee: [Abstract, §3] Abstract and §3 (LLM evaluator): the claim that the discipline-trained LLM provides a valid, pre-publication measure of idea quality independent of networks or publication outcomes is load-bearing for the entire decomposition, yet no training corpus details, fine-tuning procedure, correlation with blinded expert ratings, or out-of-sample predictive checks are reported. Without these, the score may internalize prestige-correlated text features.
  Authors: The referee correctly notes that details on the LLM training are not fully reported. We will revise §3 to provide the training corpus details and fine-tuning procedure. However, we did not collect a separate set of blinded expert ratings for correlation analysis, limiting our ability to add that specific check. Revision: partial.
- Referee: [§4] §4 (production function): the five-input function is estimated on the same sample used to construct the inputs (idea quality, execution, connections, etc.), with no mention of hold-out validation, external benchmarks, or robustness to alternative functional forms; this directly raises the circularity risk noted in the stress test and undermines the reported sequence of effect sizes.
  Authors: We will add hold-out validation by splitting the sample and re-estimating the production function on the training subset to predict on the hold-out, along with checks for alternative functional forms. This will be included in the revised manuscript to address the circularity concern. Revision: yes.
- Referee: [§5] §5 (results on connections): the two-channel claim (higher scores plus residual placement advantage) is central to the 'additive' conclusion, but the manuscript supplies no explicit test separating whether the residual channel reflects favoritism versus unmeasured execution or idea dimensions that the LLM rubric misses.
  Authors: To better isolate the residual channel, we will include an additional test in §5 that examines the connection effect within subsamples where the LLM idea quality and execution scores are both high, to see if the placement advantage remains. This provides an indirect test against unmeasured quality dimensions. Revision: yes.
- Unresolved: correlation with blinded expert ratings, as no such validation was performed beyond the training annotations.
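The hold-out check promised in the response to the §4 comment could look roughly like the following. This is a sketch under stated assumptions, not the authors' procedure: it assumes a linear five-input specification, uses synthetic data with hypothetical weights, and fits by ordinary least squares through the normal equations. The helper names (`solve`, `fit_ols`, `predict`, `r_squared`) are illustrative.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def r_squared(actual, fitted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Synthetic stand-in for the five inputs; the weights are hypothetical.
rng = random.Random(0)
TRUE_WEIGHTS = [1.0, 0.6, 0.4, 0.3, 0.1]
rows = [[rng.random() for _ in range(5)] for _ in range(400)]
y = [sum(w * v for w, v in zip(TRUE_WEIGHTS, r)) + rng.gauss(0, 0.1) for r in rows]

# Rows are already in random order, so a sequential split is a random split.
train_rows, test_rows = rows[:300], rows[300:]
train_y, test_y = y[:300], y[300:]

coef = fit_ols(train_rows, train_y)  # estimated on the training subset only
holdout_r2 = r_squared(test_y, [predict(coef, r) for r in test_rows])
print(f"hold-out R^2 = {holdout_r2:.3f}")
```

A split of this kind checks whether the fitted coefficients generalize to unseen papers. It does not by itself test whether the LLM score has absorbed prestige-correlated text features, which is the separate concern raised under the §3 comment.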
Circularity Check
No significant circularity; derivation is self-contained
Full rationale
The paper constructs a text-based idea-quality score via a discipline-trained LLM asserted to operate without author names or publication outcomes, then combines this with four other inputs (execution rubric, connection index, author-ability index, off-the-shelf LM score) to estimate a five-input production function explaining journal placement on 6,208 papers. This is a standard regression of observed outcomes on independently constructed features; no equation reduces the LLM score or production-function coefficients to the journal outcome by construction, no self-citation chain is load-bearing, and no fitted parameter is relabeled as an out-of-sample prediction. The central decomposition therefore rests on the asserted independence of the LLM evaluator rather than on any definitional or statistical tautology internal to the reported estimates.
Axiom & Free-Parameter Ledger
free parameters (1)
- production function coefficients
axioms (1)
- domain assumption: The LLM evaluator measures idea quality independently of author identity and publication outcome.