Analyzing Linguistic Complexity and Scientific Impact

Cassidy R. Sugimoto; Chao Lu; Chengzhi Zhang; Jie Wang; Logan Paul; Vincent Larivi\`ere; Xianlei Dong; Yi Bu; Ying Ding

arxiv: 1907.11843 · v1 · pith:DIOB2ZPXnew · submitted 2019-07-27 · 💻 cs.CL · cs.DL· physics.soc-ph

Analyzing Linguistic Complexity and Scientific Impact

Chao Lu , Yi Bu , Xianlei Dong , Jie Wang , Ying Ding , Vincent Larivi\`ere , Cassidy R. Sugimoto , Logan Paul

show 1 more author

Chengzhi Zhang

This is my paper

Pith reviewed 2026-05-24 15:11 UTC · model grok-4.3

classification 💻 cs.CL cs.DLphysics.soc-ph

keywords linguistic complexityscientific writingcitation impactbiologypsychologytextual featurescitation stratascholarly communication

0 comments

The pith

Linguistic complexity of scientific papers shows no practical significant relationship with citation impact.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether the linguistic complexity of writing influences how often scientific articles are cited. It measures 12 specific complexity variables across 36,400 full-text biology articles and 1,797 psychology articles. Each article is placed into high, medium, or low citation groups. The comparison finds no meaningful connection between the complexity scores and citation levels in either field. This leads to the conclusion that textual complexity plays little role in scientific impact for the data examined.

Core claim

By analyzing 36,400 biology articles and 1,797 psychology articles using 12 linguistic complexity variables as a proxy for scientific writing, the study found no practical significant relationship between these features and whether papers fell into high, medium, or low citation categories. This indicates that textual complexity plays little role in scientific impact within the examined data sets.

What carries the argument

The 12 linguistic complexity variables used as a proxy for scientific writing, compared directly against citation strata of high, medium, and low.

If this is right

Textual complexity is not a major driver of citation rates in these two disciplines.
Adjusting the complexity of scientific writing is unlikely to change an article's impact category.
Factors other than language features are more relevant to explaining differences in citations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Writing advice for researchers could shift attention away from complexity toward other elements such as clarity or content novelty.
The absence of a link in biology and psychology raises the question of whether the same pattern appears in fields with different writing conventions.
Impact metrics might better be explained by research substance or collaboration patterns than by surface language traits.

Load-bearing premise

The 12 selected linguistic complexity variables sufficiently capture the relevant aspects of scientific writing complexity, and the categorization into high, medium, and low citation strata is appropriate for detecting relationships.

What would settle it

A replication study that finds a strong statistical association between one or more of the 12 complexity measures and citation strata in a similar sized sample of biology or psychology papers would challenge the central result.

read the original abstract

The number of publications and the number of citations received have become the most common indicators of scholarly success. In this context, scientific writing increasingly plays an important role in scholars' scientific careers. To understand the relationship between scientific writing and scientific impact, this paper selected 12 variables of linguistic complexity as a proxy for depicting scientific writing. We then analyzed these features from 36,400 full-text Biology articles and 1,797 full-text Psychology articles. These features were compared to the scientific impact of articles, grouped into high, medium, and low categories. The results suggested no practical significant relationship between linguistic complexity and citation strata in either discipline. This suggests that textual complexity plays little role in scientific impact in our data sets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Large full-text analysis in biology and psychology finds no practical link between 12 standard linguistic complexity measures and citation strata.

read the letter

The main thing to know is that this paper reports a null result: linguistic complexity shows little connection to whether papers land in high, medium, or low citation groups in either field. They ran the comparison on 36,400 biology articles and 1,797 psychology articles using full text, which gives the finding some weight as a straightforward empirical check rather than a small-sample observation. The work is new in the narrow sense of applying existing complexity variables at this scale to these two disciplines and producing a negative outcome not already documented in the cited prior work. It does the basic job cleanly by sticking to observable features and external citation data without inflating the claim. The negative finding itself is useful for anyone tracking what actually drives impact metrics. Soft spots are mostly around missing details in the abstract: no mention of controls for paper length or subfield differences, which often correlate with both complexity scores and citations; the exact definitions of the 12 variables and the citation bin boundaries are not shown here; and the scope stays limited to two fields and citation-based impact. Those gaps make the support for the central claim thinner than the sample size suggests, but they do not create an internal contradiction or obvious power problem. This is the kind of paper that belongs in a bibliometrics or science-of-science reading group for the data point it adds. I would cite it only if pulling together evidence on null effects in impact studies. It deserves peer review because the empirical setup is clear enough for referees to test the methods directly and see whether the null survives added controls.

Referee Report

3 major / 1 minor

Summary. The manuscript analyzes the relationship between linguistic complexity, proxied by 12 variables, and scientific impact via citation counts in 36,400 Biology and 1,797 Psychology full-text articles. By binning articles into high, medium, and low citation strata, the authors find no practically significant relationship between these complexity features and citation levels in either discipline, concluding that textual complexity plays little role in scientific impact.

Significance. A robust null result on this scale would indicate that linguistic complexity measures do not meaningfully predict citation impact, implying that writing-style interventions are unlikely to boost scholarly success and redirecting attention to factors such as topical novelty or network effects. The Biology sample size provides reasonable power if the analysis is properly specified.

major comments (3)

[Abstract] Abstract: the claim of 'no practical significant relationship' is presented without any description of the statistical tests, effect-size thresholds, confidence intervals, or correction for multiple comparisons used to reach this conclusion.
[Methods] Methods (variable definitions and controls): no information is given on how the 12 linguistic complexity variables were operationalized or computed from the full texts, nor on controls for confounders such as article length, publication year, or subfield, all of which can jointly influence both complexity scores and citation counts.
[Results] Results (citation strata): the boundaries separating high, medium, and low citation categories are not specified, and the analysis does not report sensitivity checks to alternative binning choices despite these boundaries being free parameters.

minor comments (1)

[Abstract] Abstract: the Psychology sample (n=1,797) is an order of magnitude smaller than the Biology sample; power implications for the null result in that discipline should be addressed.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We address each of the major comments below, indicating the revisions we plan to make.

read point-by-point responses

Referee: [Abstract] Abstract: the claim of 'no practical significant relationship' is presented without any description of the statistical tests, effect-size thresholds, confidence intervals, or correction for multiple comparisons used to reach this conclusion.

Authors: We acknowledge that the abstract, due to its brevity, does not detail the statistical procedures. However, the Methods section describes the use of appropriate non-parametric tests to compare complexity features across citation strata, with effect sizes calculated to assess practical significance (using thresholds such as |r| < 0.1 for negligible effects). We will revise the abstract to include a concise mention of these approaches, including that no multiple comparison correction was needed as the primary analysis was exploratory. revision: yes
Referee: [Methods] Methods (variable definitions and controls): no information is given on how the 12 linguistic complexity variables were operationalized or computed from the full texts, nor on controls for confounders such as article length, publication year, or subfield, all of which can jointly influence both complexity scores and citation counts.

Authors: We agree that additional details on variable operationalization and potential confounders would strengthen the manuscript. In the revised version, we will expand the Methods section to explicitly describe how each of the 12 variables (e.g., sentence length, lexical diversity) was computed using standard NLP tools, and we will include analyses controlling for article length, publication year, and subfield using multivariate regression or stratification. revision: yes
Referee: [Results] Results (citation strata): the boundaries separating high, medium, and low citation categories are not specified, and the analysis does not report sensitivity checks to alternative binning choices despite these boundaries being free parameters.

Authors: The citation strata were defined using tertiles of the citation distribution within each discipline to ensure balanced groups. We will specify these exact boundaries (e.g., low: bottom 33%, medium: middle 33%, high: top 33%) in the revised Results section and add sensitivity analyses using alternative binning methods such as quartiles and median splits to confirm the robustness of the null findings. revision: yes

Circularity Check

0 steps flagged

No circularity: straightforward empirical comparison

full rationale

The paper performs a direct empirical analysis: 12 standard linguistic complexity features are extracted from full-text articles and compared against externally observed citation counts binned into high/medium/low strata. No mathematical derivation, fitted parameters renamed as predictions, self-referential equations, or load-bearing self-citations appear in the reported chain. The null result follows from straightforward statistical comparison to independent citation data and does not reduce to any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The study depends on standard bibliometric assumptions about impact measurement and the choice of linguistic features without introducing new entities or fitted parameters beyond category definitions.

free parameters (1)

citation strata boundaries
The thresholds defining high, medium, and low citation categories are not specified and may be chosen ad hoc or based on data distribution.

axioms (2)

domain assumption Citation counts serve as a reliable proxy for scientific impact
Used to group articles into impact strata for comparison.
domain assumption The 12 linguistic variables adequately measure writing complexity
Selected as proxy for depicting scientific writing.

pith-pipeline@v0.9.0 · 5668 in / 1210 out tokens · 34015 ms · 2026-05-24T15:11:34.967907+00:00 · methodology

Analyzing Linguistic Complexity and Scientific Impact

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)