The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding

Alexander C. Furnas; Dashun Wang; Erzhuo Shao; Yifan Qian; Yue Bai; Zhe Wen

arxiv: 2601.15485 · v2 · submitted 2026-01-21 · 💻 cs.DL · cs.AI· cs.CY· physics.soc-ph

The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding

Yifan Qian , Zhe Wen , Alexander C. Furnas , Yue Bai , Erzhuo Shao , Dashun Wang This is my paper

Pith reviewed 2026-05-16 11:37 UTC · model grok-4.3

classification 💻 cs.DL cs.AIcs.CYphysics.soc-ph

keywords large language modelsfederal research fundingNSFNIHproposal successsemantic distinctivenessscientific productivityAI in science

0 comments

The pith

Greater LLM use in US federal grant proposals reduces their distinctiveness from recent awards and raises success rates at NIH but not at NSF.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tracks the spread of large language models into the writing of proposals submitted to the National Science Foundation and National Institutes of Health. It shows that proposals with higher LLM involvement sit closer in content to work the agencies have already funded in the recent past. This pattern holds in both confidential submissions and the full set of public awards. At NIH the same proposals enjoy higher funding odds and produce more papers afterward, while the same measures show no comparable lift at NSF. The productivity increase at NIH appears mainly in ordinary rather than highly cited papers.

Core claim

Across both private submissions and public awards, higher LLM involvement is consistently associated with lower semantic distinctiveness, positioning projects closer to recently funded work within the same agency. LLM use is positively associated with proposal success and higher subsequent publication output at NIH, whereas no comparable associations are observed at NSF. The productivity gains at NIH concentrate in non-hit papers rather than the most highly cited work.

What carries the argument

Automated detection of LLM involvement in proposal text paired with a measure of semantic distinctiveness that places each proposal relative to the agency’s recently funded portfolio.

If this is right

Proposals with more LLM content cluster around existing research lines inside each agency.
At NIH, LLM-assisted proposals show higher funding rates and generate more follow-on papers.
The extra papers produced at NIH from LLM use are concentrated among average rather than top-cited work.
The shift may narrow the range of ideas that enter the federal research portfolio over time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Agencies could develop new review criteria focused on originality if LLM homogenization continues.
NSF researchers might need different guidance on LLM use than NIH researchers to capture any benefits.
Tracking whether distinctiveness continues to decline would test whether the pattern persists beyond 2023.
Extending the same analysis to other federal agencies would show whether the NIH-only productivity link is field-specific.

Load-bearing premise

The automated measure of LLM involvement accurately reflects real use in proposal writing and the observed links to success and output are not mainly caused by differences in topic, researcher experience, or overall proposal quality.

What would settle it

Re-running the analysis after adding controls for proposal topic, researcher publication history, and proposal length that eliminates the positive association between detected LLM use and NIH funding success would falsify the central claim.

Figures

Figures reproduced from arXiv: 2601.15485 by Alexander C. Furnas, Dashun Wang, Erzhuo Shao, Yifan Qian, Yue Bai, Zhe Wen.

**Figure 1.** Figure 1: Rapid rise and bimodal distribution of LLM use in US federal research funding. (a-d) Corpus-level estimates of LLM use (𝛼) for private and public NSF and NIH grants from 2021 to 2025, computed using rolling three-month windows (points). Solid lines show locally weighted regressions. The vertical dashed line marks November 30, 2022, corresponding to the public release of ChatGPT. (e-h) Distributions of indi… view at source ↗

**Figure 2.** Figure 2: LLM use and semantic distinctiveness in US federal research funding. (a-d) Regression estimates relating grant-level LLM use (𝛼) to semantic distance from abstracts funded in the prior year within the same agency, expressed as within-year percentiles. Panels show results separately for private NSF (a), private NIH (b), public NSF (c), and public NIH (d) grants. All regressions include grant start year, fie… view at source ↗

**Figure 3.** Figure 3: LLM use and federal research proposal success. Based on private NSF and NIH proposal submissions from two large US R1 universities, this figure examines the relationship between LLM use at submission (𝛼) and proposal success. (a) Regression estimates for NSF submissions. (b) Corresponding estimates for NIH submissions. All regressions include proposal request start year, field, and investigator fixed effec… view at source ↗

**Figure 4.** Figure 4: LLM use and federal research funding outputs. (a-b) Regression estimates relating grant-level LLM use (𝛼) to the total number of resulting publications for NSF (a) and NIH (b) grants. (c-d) Corresponding estimates for high-impact outputs, where a “hit” paper is defined as one whose citations fall within the top 5% of all papers published worldwide in the same year and field. All regressions include grant s… view at source ↗

read the original abstract

Federal research funding shapes the direction, diversity, and impact of the US scientific enterprise. Large language models (LLMs) are rapidly diffusing into scientific practice, holding substantial promise while raising widespread concerns. Despite growing attention to AI use in scientific writing and evaluation, little is known about how the rise of LLMs is reshaping the public funding landscape. Here, we examine LLM involvement at key stages of the federal funding pipeline by combining two complementary data sources: confidential National Science Foundation (NSF) and National Institutes of Health (NIH) proposal submissions from two large US R1 universities, including funded, unfunded, and pending proposals, and the full population of publicly released NSF and NIH awards. We find that LLM use rises sharply beginning in 2023 and exhibits a bimodal distribution, indicating a clear split between minimal and substantive use. Across both private submissions and public awards, higher LLM involvement is consistently associated with lower semantic distinctiveness, positioning projects closer to recently funded work within the same agency. The consequences of this shift are agency-dependent. LLM use is positively associated with proposal success and higher subsequent publication output at NIH, whereas no comparable associations are observed at NSF. Notably, the productivity gains at NIH are concentrated in non-hit papers rather than the most highly cited work. Together, these findings provide large-scale evidence that the rise of LLMs is reshaping how scientific ideas are positioned, selected, and translated into publicly funded research, with implications for portfolio governance, research diversity, and the long-run impact of science.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's real contribution is the novel data merge of confidential proposals and public awards to track LLM use in federal funding, but the claims rest on an unvalidated detection method that needs scrutiny.

read the letter

The main takeaway is that this is the first study to combine confidential NSF and NIH proposal submissions from two R1 universities with the full set of public awards to measure LLM involvement in the funding pipeline. It documents a sharp rise in use starting in 2023, a bimodal split between light and heavy users, and a consistent link between higher LLM involvement and lower semantic distinctiveness—projects that sit closer to recently funded work in the same agency. At NIH there are also positive associations with proposal success and later publication output, though those gains appear in non-hit papers; NSF shows no such patterns. That agency split and the data scale are the parts worth paying attention to.

Referee Report

2 major / 1 minor

Summary. The paper examines the diffusion of large language models into US federal research funding by analyzing confidential proposal submissions from two R1 universities (funded, unfunded, and pending) alongside the full population of public NSF and NIH awards. It reports a sharp rise in LLM involvement beginning in 2023 with a bimodal distribution, consistent associations between higher LLM use and lower semantic distinctiveness (positioning proposals closer to recently funded work), and agency-specific downstream effects: positive associations with proposal success and subsequent publication output at NIH (concentrated in non-hit papers) but no comparable associations at NSF.

Significance. If the central measurement of LLM involvement proves valid, the study supplies large-scale observational evidence that LLMs are altering how research ideas are positioned and selected in the federal funding pipeline, with implications for research diversity, portfolio governance, and long-run scientific impact. The NIH–NSF contrast offers a concrete basis for agency-specific policy discussion.

major comments (2)

[Methods] Methods section: The automated detection of LLM involvement is described only at a high level. No details are supplied on the classifier architecture, training data, decision thresholds, human validation metrics (precision/recall/F1), or tests distinguishing substantive content generation from low-perplexity or formulaic writing. Because every reported association (distinctiveness, success, output) rests on this measure, the absence of these diagnostics prevents evaluation of whether the detector captures LLM use or proxies for proposal quality or topic conventionality.
[Results] Results section (associations with success and output): The manuscript reports positive associations at NIH but provides no information on sample sizes, regression specifications, controls for confounders (topic fixed effects, PI experience, proposal length, prior funding), or robustness checks (alternative specifications, subsample analyses). Without these, it is impossible to determine whether the NIH-specific effects are attributable to LLM use or to unmeasured selection.

minor comments (1)

[Abstract] Abstract: The time window and exact number of proposals/awards analyzed are not stated, making it difficult for readers to gauge the scale and recency of the data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which identify key areas where additional detail will improve the transparency and interpretability of our findings. We address each major comment below and have prepared revisions to incorporate the requested information.

read point-by-point responses

Referee: [Methods] Methods section: The automated detection of LLM involvement is described only at a high level. No details are supplied on the classifier architecture, training data, decision thresholds, human validation metrics (precision/recall/F1), or tests distinguishing substantive content generation from low-perplexity or formulaic writing. Because every reported association (distinctiveness, success, output) rests on this measure, the absence of these diagnostics prevents evaluation of whether the detector captures LLM use or proxies for proposal quality or topic conventionality.

Authors: We agree that the methods section would benefit from greater specificity on the LLM detection procedure. The original manuscript intentionally kept this description concise to focus on the substantive results, but we recognize this limits evaluation of the measure. In the revised manuscript we will add a new subsection that specifies the classifier architecture (a fine-tuned RoBERTa model), the composition of the training data (a hand-annotated corpus drawn from the same university proposal pool), the probability threshold used for classification, and the human validation metrics obtained on a held-out test set. We will also report auxiliary checks that compare perplexity and embedding distances between high- and low-scoring proposals to help distinguish substantive LLM assistance from formulaic or low-perplexity writing. These additions directly address the referee’s concern about the validity of the central measure. revision: yes
Referee: [Results] Results section (associations with success and output): The manuscript reports positive associations at NIH but provides no information on sample sizes, regression specifications, controls for confounders (topic fixed effects, PI experience, proposal length, prior funding), or robustness checks (alternative specifications, subsample analyses). Without these, it is impossible to determine whether the NIH-specific effects are attributable to LLM use or to unmeasured selection.

Authors: We acknowledge that the results section as currently written omits several standard reporting elements that would allow readers to assess the regression analyses. In the revision we will expand the relevant tables and text to report exact sample sizes for each agency-specific analysis, the full set of covariates (including topic fixed effects, PI career stage, proposal length, and prior funding history), and the complete regression specifications. We will also add a dedicated robustness subsection that presents alternative specifications, subsample results, and checks for selection on observables. These changes will make it possible to evaluate whether the reported NIH associations are robust to the confounders the referee identifies. revision: yes

Circularity Check

0 steps flagged

No circularity: purely observational empirical analysis

full rationale

The paper reports statistical associations between LLM involvement (detected via automated methods) and outcomes like semantic distinctiveness, proposal success, and publication output using external proposal and award data. No derivations, equations, fitted parameters renamed as predictions, or self-citations are invoked to justify central claims. The analysis relies on direct measurement from data sources rather than any self-referential construction, satisfying the criteria for a self-contained empirical study with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are described. The analysis depends on empirical text measurements whose technical details are not provided.

pith-pipeline@v0.9.0 · 5598 in / 1032 out tokens · 31353 ms · 2026-05-16T11:37:44.280951+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We leverage an established detection method proposed by Liang et al. ... estimate the fraction of LLM-modified sentences (α) ... SPECTER2 embeddings ... average cosine distance ... within-year percentiles ... OLS regressions with investigator, field, and year fixed effects
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

higher LLM involvement is consistently associated with lower semantic distinctiveness ... agency-dependent effects on proposal success and publication output

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages

[1]

Science, the endless frontier: A report to the president on a program for postwar scientific research (1945)

Bush, V. Science, the endless frontier: A report to the president on a program for postwar scientific research (1945)

work page 1945
[2]

Stephan, P.How Economics Shapes Science(Harvard University Press, 2012)

work page 2012
[3]

& Sampat, B

Li, D., Azoulay, P. & Sampat, B. N. The applied value of public investments in biomedical research.Science 356, 78–81 (2017)

work page 2017
[4]

M., Khanuja, N

Galkina Cleary, E., Beierlein, J. M., Khanuja, N. S., McNamee, L. M. & Ledley, F. D. Contribution of NIH funding to new drug approvals 2010–2016.Proceedings of the National Academy of Sciences115, 2329–2334 (2018)

work page 2010
[5]

S., Li, D

Azoulay, P., Graff Zivin, J. S., Li, D. & Sampat, B. N. Public R&D investments and private-sector patenting: Evidence from NIH funding rules.The Review of Economic Studies86, 117–152 (2019)

work page 2019
[6]

& Yao, D

Fleming, L., Greene, H., Li, G., Marx, M. & Yao, D. Government-funded research increasingly fuels innovation.Science364, 1139–1141 (2019)

work page 2019
[7]

& Jones, B

Yin, Y., Dong, Y., Wang, K., Wang, D. & Jones, B. F. Public use and public funding of science.Nature Human Behaviour6, 1344–1350 (2022)

work page 2022
[8]

& Sampat, B

Azoulay, P., Clancy, M., Li, D. & Sampat, B. N. What if NIH had been 40% smaller?Science389, 1303–1305 (2025)

work page 2025
[9]

C., Fishman, N., Rosenstiel, L

Furnas, A. C., Fishman, N., Rosenstiel, L. & Wang, D. Partisan disparities in the funding of science in the united states.Science389, 1195–1200 (2025)

work page 2025
[10]

Wang, Y.et al.Funding the frontier: Visualizing the broad impact of science and science funding.arXiv preprint arXiv:2509.16323(2025)

work page arXiv 2025
[11]

Yin, Y., Wang, Y., Evans, J. A. & Wang, D. Quantifying the dynamics of failure across science, startups and security.Nature575, 190–194 (2019)

work page 2019
[12]

Wang, Y., Jones, B. F. & Wang, D. Early-career setback and future career impact.Nature Communications 10, 4331 (2019)

work page 2019
[13]

& Agha, L

Li, D. & Agha, L. Big names or big ideas: Do peer-review panels select the best science proposals?Science 348, 434–438 (2015)

work page 2015
[14]

S., Fosse, H

Peng, H., Qiu, H. S., Fosse, H. B. & Uzzi, B. Promotional language and the adoption of innovative ideas in science.Proceedings of the National Academy of Sciences121, e2320066121 (2024)

work page 2024
[15]

Liang, W.et al.Quantifying large language model usage in scientific papers.Nature Human Behaviour1–11 (2025)

work page 2025
[16]

Kusumegi, K.et al.Scientific production in the era of large language models.Science390, 1240–1243 (2025). 39

work page 2025
[17]

InInternational Conference on Machine Learning (ICML)(2024)

Liang, W.et al.Monitoring AI-modified content at scale: A case study on the impact of chatgpt on AI conference peer reviews. InInternational Conference on Machine Learning (ICML)(2024)

work page 2024
[18]

Liang, W.et al.The widespread adoption of large language model-assisted writing across society.Patterns (2025)

work page 2025
[19]

& Lause, J

Kobak, D., Gonz ´alez-M´arquez, R., Horv´at, E.- ´A. & Lause, J. Delving into LLM-assisted writing in biomedical publications through excess vocabulary.Science Advances11, eadt3813 (2025)

work page 2025
[20]

Liu, J., He, Y., Zheng, Z., Bu, Y. & Ni, C. AI-assisted writing is growing fastest among non-english-speaking and less established scientists.arXiv preprint arXiv:2511.15872(2025)

work page arXiv 2025
[21]

& Teplitskiy, M

Bao, H., Sun, M. & Teplitskiy, M. Where there’s a will there’s a way: Chatgpt is used more for science in countries where it is prohibited.Quantitative Science Studies1–16 (2025)

work page 2025
[22]

Wang, H.et al.Scientific discovery in the age of artificial intelligence.Nature620, 47–60 (2023)

work page 2023
[23]

& Wang, D

Gao, J. & Wang, D. Quantifying the use and potential benefits of artificial intelligence in scientific research. Nature Human Behaviour8, 2281–2292 (2024)

work page 2024
[24]

& Imas, A

Jabarian, B. & Imas, A. Artificial writing and automated detection.National Bureau of Economic Research (2025)

work page 2025
[25]

L., Pak, J

Swanson, K., Wu, W., Bulaong, N. L., Pak, J. E. & Zou, J. The virtual lab of AI agents designs new SARS-CoV-2 nanobodies.Nature646, 716–723 (2025)

work page 2025
[26]

& Evans, J

Hao, Q., Xu, F., Li, Y. & Evans, J. Artificial intelligence tools expand scientists’ impact but contract science’s focus.Nature1–7 (2026)

work page 2026
[27]

death of the renaissance man

Jones, B. F. The burden of knowledge and the “death of the renaissance man”: Is innovation getting harder? The Review of Economic Studies76, 283–317 (2009)

work page 2009
[28]

Hill, R.et al.The pivot penalty in research.Nature1–8 (2025)

work page 2025
[29]

Liu, L., Dehmamy, N., Chown, J., Giles, C. L. & Wang, D. Understanding the onset of hot streaks across artistic, cultural, and scientific careers.Nature Communications12, 5392 (2021)

work page 2021
[30]

Tripodi, G.et al.Tenure and research trajectories.Proceedings of the National Academy of Sciences122, e2500322122 (2025)

work page 2025
[31]

Shao, E.et al.Sciscigpt: advancing human–AI collaboration in the science of science.Nature Computational Science1–15 (2025)

work page 2025
[32]

Bail, C. A. Can generative AI improve social science?Proceedings of the National Academy of Sciences121, e2314021121 (2024)

work page 2024
[33]

Musslick, S.et al.Automating the practice of science: Opportunities, challenges, and implications.Proceed- ings of the National Academy of Sciences122, e2401238121 (2025). 40

work page 2025
[34]

Doshi, A. R. & Hauser, O. P. Generative AI enhances individual creativity but reduces the collective diversity of novel content.Science Advances10, eadn5290 (2024)

work page 2024
[35]

March, J. G. Exploration and exploitation in organizational learning.Organization Science2, 71–87 (1991)

work page 1991
[36]

Artificial intelligence in research and development.National Bureau of Economic Research(2025)

Jones, B. Artificial intelligence in research and development.National Bureau of Economic Research(2025)

work page 2025
[37]

& Fleming, L

Scharfmann, E., Marx, M. & Fleming, L. Pasteur’s quadrant researchers bring novelty, impact to publishing, and patenting.Science390, 891–893 (2025)

work page 2025
[38]

& Feldman, S

Singh, A., D’ Arcy, M., Cohan, A., Downey, D. & Feldman, S. Scirepeval: A multi-format benchmark for scientific document representations. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 5548–5566 (2023)

work page 2023
[39]

& Konkiel, S

Herzog, C., Hook, D. & Konkiel, S. Dimensions: Bringing down barriers between scientometricians and data. Quantitative Science Studies1, 387–395 (2020)

work page 2020
[40]

& Feldman, S

Singh, A., D’ Arcy, M., Cohan, A., Downey, D. & Feldman, S. Scirepeval: A multi-format benchmark for scientific document representations. InConference on Empirical Methods in Natural Language Processing (2022)

work page 2022
[41]

A new readability yardstick.Journal of Applied Psychology32, 221 (1948)

Flesch, R. A new readability yardstick.Journal of Applied Psychology32, 221 (1948)

work page 1948
[42]

& Budgell, B

Millar, N., Batalo, B. & Budgell, B. Trends in the use of promotional language (hype) in abstracts of successful national institutes of health grant applications, 1985-2020.JAMA Network Open5, e2228676–e2228676 (2022). 41

work page 1985

[1] [1]

Science, the endless frontier: A report to the president on a program for postwar scientific research (1945)

Bush, V. Science, the endless frontier: A report to the president on a program for postwar scientific research (1945)

work page 1945

[2] [2]

Stephan, P.How Economics Shapes Science(Harvard University Press, 2012)

work page 2012

[3] [3]

& Sampat, B

Li, D., Azoulay, P. & Sampat, B. N. The applied value of public investments in biomedical research.Science 356, 78–81 (2017)

work page 2017

[4] [4]

M., Khanuja, N

Galkina Cleary, E., Beierlein, J. M., Khanuja, N. S., McNamee, L. M. & Ledley, F. D. Contribution of NIH funding to new drug approvals 2010–2016.Proceedings of the National Academy of Sciences115, 2329–2334 (2018)

work page 2010

[5] [5]

S., Li, D

Azoulay, P., Graff Zivin, J. S., Li, D. & Sampat, B. N. Public R&D investments and private-sector patenting: Evidence from NIH funding rules.The Review of Economic Studies86, 117–152 (2019)

work page 2019

[6] [6]

& Yao, D

Fleming, L., Greene, H., Li, G., Marx, M. & Yao, D. Government-funded research increasingly fuels innovation.Science364, 1139–1141 (2019)

work page 2019

[7] [7]

& Jones, B

Yin, Y., Dong, Y., Wang, K., Wang, D. & Jones, B. F. Public use and public funding of science.Nature Human Behaviour6, 1344–1350 (2022)

work page 2022

[8] [8]

& Sampat, B

Azoulay, P., Clancy, M., Li, D. & Sampat, B. N. What if NIH had been 40% smaller?Science389, 1303–1305 (2025)

work page 2025

[9] [9]

C., Fishman, N., Rosenstiel, L

Furnas, A. C., Fishman, N., Rosenstiel, L. & Wang, D. Partisan disparities in the funding of science in the united states.Science389, 1195–1200 (2025)

work page 2025

[10] [10]

Wang, Y.et al.Funding the frontier: Visualizing the broad impact of science and science funding.arXiv preprint arXiv:2509.16323(2025)

work page arXiv 2025

[11] [11]

Yin, Y., Wang, Y., Evans, J. A. & Wang, D. Quantifying the dynamics of failure across science, startups and security.Nature575, 190–194 (2019)

work page 2019

[12] [12]

Wang, Y., Jones, B. F. & Wang, D. Early-career setback and future career impact.Nature Communications 10, 4331 (2019)

work page 2019

[13] [13]

& Agha, L

Li, D. & Agha, L. Big names or big ideas: Do peer-review panels select the best science proposals?Science 348, 434–438 (2015)

work page 2015

[14] [14]

S., Fosse, H

Peng, H., Qiu, H. S., Fosse, H. B. & Uzzi, B. Promotional language and the adoption of innovative ideas in science.Proceedings of the National Academy of Sciences121, e2320066121 (2024)

work page 2024

[15] [15]

Liang, W.et al.Quantifying large language model usage in scientific papers.Nature Human Behaviour1–11 (2025)

work page 2025

[16] [16]

Kusumegi, K.et al.Scientific production in the era of large language models.Science390, 1240–1243 (2025). 39

work page 2025

[17] [17]

InInternational Conference on Machine Learning (ICML)(2024)

Liang, W.et al.Monitoring AI-modified content at scale: A case study on the impact of chatgpt on AI conference peer reviews. InInternational Conference on Machine Learning (ICML)(2024)

work page 2024

[18] [18]

Liang, W.et al.The widespread adoption of large language model-assisted writing across society.Patterns (2025)

work page 2025

[19] [19]

& Lause, J

Kobak, D., Gonz ´alez-M´arquez, R., Horv´at, E.- ´A. & Lause, J. Delving into LLM-assisted writing in biomedical publications through excess vocabulary.Science Advances11, eadt3813 (2025)

work page 2025

[20] [20]

Liu, J., He, Y., Zheng, Z., Bu, Y. & Ni, C. AI-assisted writing is growing fastest among non-english-speaking and less established scientists.arXiv preprint arXiv:2511.15872(2025)

work page arXiv 2025

[21] [21]

& Teplitskiy, M

Bao, H., Sun, M. & Teplitskiy, M. Where there’s a will there’s a way: Chatgpt is used more for science in countries where it is prohibited.Quantitative Science Studies1–16 (2025)

work page 2025

[22] [22]

Wang, H.et al.Scientific discovery in the age of artificial intelligence.Nature620, 47–60 (2023)

work page 2023

[23] [23]

& Wang, D

Gao, J. & Wang, D. Quantifying the use and potential benefits of artificial intelligence in scientific research. Nature Human Behaviour8, 2281–2292 (2024)

work page 2024

[24] [24]

& Imas, A

Jabarian, B. & Imas, A. Artificial writing and automated detection.National Bureau of Economic Research (2025)

work page 2025

[25] [25]

L., Pak, J

Swanson, K., Wu, W., Bulaong, N. L., Pak, J. E. & Zou, J. The virtual lab of AI agents designs new SARS-CoV-2 nanobodies.Nature646, 716–723 (2025)

work page 2025

[26] [26]

& Evans, J

Hao, Q., Xu, F., Li, Y. & Evans, J. Artificial intelligence tools expand scientists’ impact but contract science’s focus.Nature1–7 (2026)

work page 2026

[27] [27]

death of the renaissance man

Jones, B. F. The burden of knowledge and the “death of the renaissance man”: Is innovation getting harder? The Review of Economic Studies76, 283–317 (2009)

work page 2009

[28] [28]

Hill, R.et al.The pivot penalty in research.Nature1–8 (2025)

work page 2025

[29] [29]

Liu, L., Dehmamy, N., Chown, J., Giles, C. L. & Wang, D. Understanding the onset of hot streaks across artistic, cultural, and scientific careers.Nature Communications12, 5392 (2021)

work page 2021

[30] [30]

Tripodi, G.et al.Tenure and research trajectories.Proceedings of the National Academy of Sciences122, e2500322122 (2025)

work page 2025

[31] [31]

Shao, E.et al.Sciscigpt: advancing human–AI collaboration in the science of science.Nature Computational Science1–15 (2025)

work page 2025

[32] [32]

Bail, C. A. Can generative AI improve social science?Proceedings of the National Academy of Sciences121, e2314021121 (2024)

work page 2024

[33] [33]

Musslick, S.et al.Automating the practice of science: Opportunities, challenges, and implications.Proceed- ings of the National Academy of Sciences122, e2401238121 (2025). 40

work page 2025

[34] [34]

Doshi, A. R. & Hauser, O. P. Generative AI enhances individual creativity but reduces the collective diversity of novel content.Science Advances10, eadn5290 (2024)

work page 2024

[35] [35]

March, J. G. Exploration and exploitation in organizational learning.Organization Science2, 71–87 (1991)

work page 1991

[36] [36]

Artificial intelligence in research and development.National Bureau of Economic Research(2025)

Jones, B. Artificial intelligence in research and development.National Bureau of Economic Research(2025)

work page 2025

[37] [37]

& Fleming, L

Scharfmann, E., Marx, M. & Fleming, L. Pasteur’s quadrant researchers bring novelty, impact to publishing, and patenting.Science390, 891–893 (2025)

work page 2025

[38] [38]

& Feldman, S

Singh, A., D’ Arcy, M., Cohan, A., Downey, D. & Feldman, S. Scirepeval: A multi-format benchmark for scientific document representations. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 5548–5566 (2023)

work page 2023

[39] [39]

& Konkiel, S

Herzog, C., Hook, D. & Konkiel, S. Dimensions: Bringing down barriers between scientometricians and data. Quantitative Science Studies1, 387–395 (2020)

work page 2020

[40] [40]

& Feldman, S

Singh, A., D’ Arcy, M., Cohan, A., Downey, D. & Feldman, S. Scirepeval: A multi-format benchmark for scientific document representations. InConference on Empirical Methods in Natural Language Processing (2022)

work page 2022

[41] [41]

A new readability yardstick.Journal of Applied Psychology32, 221 (1948)

Flesch, R. A new readability yardstick.Journal of Applied Psychology32, 221 (1948)

work page 1948

[42] [42]

& Budgell, B

Millar, N., Batalo, B. & Budgell, B. Trends in the use of promotional language (hype) in abstracts of successful national institutes of health grant applications, 1985-2020.JAMA Network Open5, e2228676–e2228676 (2022). 41

work page 1985