Data-Driven Evolution of Library and Information Science Research Methods (1990-2022): A Perspective Based on Fine-grained Method Entities
Pith reviewed 2026-06-25 20:09 UTC · model grok-4.3
The pith
Data resources drive the evolution of research methods in Library and Information Science from 1990 to 2022 through a cycle of emergence followed by stability and practical use.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using fine-grained automatic extraction of method entities from LIS papers, the study finds that data resources function as the pivotal driver of methodological evolution in the field, with research methods developing according to a cyclical pattern of emergence followed by stability and practical application.
What carries the argument
Automatic extraction of four categories of fine-grained method entities (algorithms and models, data resources, software and tools, metrics) from paper text, followed by multi-dimensional analysis of their evolution over time, across topics, and within method types.
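The analysis step described above amounts to grouping extracted entity mentions by year and category. A minimal sketch follows, assuming each mention is a `(year, category, entity)` tuple. The toy rows, the entity names, and the helper names `category_shares_by_year` and `first_seen` are hypothetical illustrations, not the paper's data or pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical toy mentions: (publication_year, category, entity_name).
# Placeholder values only; they are not taken from the paper.
mentions = [
    (1995, "data resources", "corpus_a"),
    (1995, "metrics", "metric_a"),
    (2010, "data resources", "corpus_b"),
    (2010, "algorithms and models", "model_a"),
    (2010, "data resources", "corpus_a"),
    (2020, "software and tools", "tool_a"),
    (2020, "data resources", "corpus_c"),
    (2020, "algorithms and models", "model_b"),
    (2020, "metrics", "metric_b"),
]


def category_shares_by_year(rows):
    """Return {year: {category: share of that year's mentions}}."""
    counts = defaultdict(Counter)
    for year, category, _entity in rows:
        counts[year][category] += 1
    shares = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {cat: n / total for cat, n in counter.items()}
    return shares


def first_seen(rows):
    """Return {entity: earliest year observed}, a crude emergence marker."""
    earliest = {}
    for year, _category, entity in rows:
        if entity not in earliest or year < earliest[entity]:
            earliest[entity] = year
    return earliest


shares = category_shares_by_year(mentions)
emergence = first_seen(mentions)
```

Shares rather than raw counts are used here so that years with more papers do not dominate the comparison; whether the paper normalizes this way is not stated in the summary.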
If this is right
- Data resources exert greater influence on method evolution than algorithms and models, software and tools, or metrics.
- Research methods in LIS follow a repeated cycle of emergence, stability, and practical application.
- The pace and direction of method changes vary across different research topics.
- Distinct evolutionary features appear when method entities are examined inside different categories of research methods.
Where Pith is reading between the lines
- New data resources introduced in the future would likely initiate fresh cycles of method emergence and stabilization.
- The same extraction and cycle-tracking approach could be used to monitor method evolution in neighboring data-heavy fields.
- Knowledge of the cycle might allow earlier identification of which emerging methods will reach widespread practical use.
Load-bearing premise
The automatic extraction process correctly identifies and categorizes the four types of method entities from paper text with sufficient accuracy that the resulting trends reflect actual methodological practices rather than extraction artifacts.
What would settle it
A manual review of a representative sample of the papers that finds frequent mismatches between the automatically extracted entities and the methods actually described in the text would undermine the reported evolutionary trends.
Original abstract
Since the 1990s, advancements in big data and information technology have increasingly driven data-centric research in the field of Library and Information Science (LIS). To assess the influence of this data-driven research paradigm on the LIS discipline, this study conducts a fine-grained analysis to uncover the evolutionary trends of research methods within the domain. Using academic papers from LIS published between 1990 and 2022, four key categories of data-driven method entities are automatically extracted: algorithms and models, data resources, software and tools, and metrics. Based on these entities, the study examines the evolution of LIS research methods from three dimensions: the characteristics of research method entities over time, their evolution within different research topics, and the evolutionary features of research method entities across various research methods. The findings highlight data resources as a pivotal driver of methodological evolution in LIS, revealing a cyclical pattern of "emergence-stability/practical application" in the development of research methods within the field.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes evolutionary trends in Library and Information Science (LIS) research methods from 1990-2022 by automatically extracting four categories of method entities (algorithms/models, data resources, software/tools, metrics) from published papers. It examines these entities across time, research topics, and method types, concluding that data resources are the pivotal driver of methodological change and that development follows a cyclical 'emergence-stability/practical application' pattern.
Significance. If the entity extraction is shown to be reliable, the work offers a large-scale, longitudinal view of data-driven shifts in LIS methodology that could inform discipline-level strategy and curriculum design. The scale (1990-2022) and fine-grained entity typology are strengths, but the absence of any accuracy assessment prevents evaluation of whether the reported dominance of data resources or the cyclical pattern reflects genuine practice or extraction artifacts.
major comments (2)
- [Abstract/Methods] Abstract and Methods (entity extraction description): the central claims rest on counts and co-occurrences of automatically extracted entities, yet no precision, recall, F1, confusion matrix, or inter-annotator agreement on a held-out sample is reported. Without these, temporal trends and the 'data resources as pivotal driver' conclusion cannot be distinguished from changes in terminology, model bias, or inconsistent categorization.
- [Results] Results (trend and cycle analysis): the emergence-stability cycle and topic-specific evolution claims are derived directly from the unvalidated entity counts; any systematic error in the extraction pipeline (e.g., higher false-positive rate for recent papers) would propagate into the reported cyclical pattern and the cross-topic comparisons.
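The propagation concern in the second comment can be made concrete with simple arithmetic. Of the mentions an extractor reports, a fraction equal to its precision is correct, and those correct mentions cover a fraction equal to its recall of the true total. The sketch below assumes per-period precision and recall are known; all numbers and the function name `adjusted_count` are hypothetical.

```python
import math


def adjusted_count(observed, precision, recall):
    """Estimate the true number of mentions from an observed count.

    observed * precision gives the true positives among reported mentions;
    dividing by recall accounts for true mentions the extractor missed.
    """
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must lie in [0, 1]")
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must lie in (0, 1]")
    return observed * precision / recall


# Hypothetical scenario: observed counts rise from 100 to 150, but
# precision falls from 0.9 to 0.6 while recall stays at 0.9.
early = adjusted_count(100, precision=0.9, recall=0.9)
late = adjusted_count(150, precision=0.6, recall=0.9)

# Both estimates land at about 100, so the apparent growth in this toy
# case would be entirely an artifact of the falling precision.
no_real_growth = math.isclose(early, late)
```

In this toy case a 50% rise in observed counts corresponds to no change in the underlying quantity, which is why period-specific error rates matter for any trend or cycle claim.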
minor comments (2)
- [Methods] Clarify the exact NLP model, rules, or pipeline used for entity extraction and whether any post-processing or manual review was applied.
- [Data] Provide the total number of papers processed and the distribution across the 1990-2022 period to allow assessment of sample-size effects on the reported trends.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight an important gap in our reporting. We agree that the lack of quantitative validation for the entity extraction limits the strength of our claims and will revise the manuscript to include a dedicated evaluation.
Point-by-point responses
- Referee: [Abstract/Methods] Abstract and Methods (entity extraction description): the central claims rest on counts and co-occurrences of automatically extracted entities, yet no precision, recall, F1, confusion matrix, or inter-annotator agreement on a held-out sample is reported. Without these, temporal trends and the 'data resources as pivotal driver' conclusion cannot be distinguished from changes in terminology, model bias, or inconsistent categorization.
  Authors: We acknowledge that the current manuscript does not report precision, recall, F1, or inter-annotator agreement for the entity extraction pipeline. This is a substantive limitation that prevents readers from assessing whether the observed dominance of data resources and the cyclical patterns reflect actual methodological shifts or extraction artifacts. In the revised version we will add a new Methods subsection that describes a manual annotation of a held-out sample of 300 papers (stratified by decade and topic), performed by two independent annotators, and report precision, recall, F1, a confusion matrix, and Cohen’s kappa. We will also discuss any systematic biases identified.
  Revision: yes
- Referee: [Results] Results (trend and cycle analysis): the emergence-stability cycle and topic-specific evolution claims are derived directly from the unvalidated entity counts; any systematic error in the extraction pipeline (e.g., higher false-positive rate for recent papers) would propagate into the reported cyclical pattern and the cross-topic comparisons.
  Authors: We agree that the emergence-stability cycle and cross-topic comparisons rest on the raw entity counts and would be sensitive to systematic extraction errors. Once the validation results are available, we will add a Limitations paragraph that quantifies how observed error rates could affect the reported temporal patterns and will, if necessary, re-run the cycle detection on a precision-adjusted subset. This will make the robustness of the cyclical claim explicit.
  Revision: yes
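The validation the authors promise could be computed roughly as follows. This is a sketch under stated assumptions, not the authors' evaluation code: gold and predicted entities are represented as sets of `(paper_id, category, entity)` tuples, annotator labels as parallel lists, and every value shown is a hypothetical placeholder.

```python
from collections import Counter


def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall, and F1 between two entity sets."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's marginal label rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations for illustration only.
gold = {("p1", "data", "corpus_a"), ("p1", "metric", "metric_a"),
        ("p2", "algo", "model_a"), ("p2", "tool", "tool_a")}
predicted = {("p1", "data", "corpus_a"), ("p1", "metric", "metric_a"),
             ("p2", "algo", "model_a"), ("p2", "data", "tool_a")}
p, r, f1 = precision_recall_f1(gold, predicted)

annotator_a = ["data", "data", "algo", "metric", "tool", "data"]
annotator_b = ["data", "algo", "algo", "metric", "tool", "data"]
kappa = cohens_kappa(annotator_a, annotator_b)
```

Exact matching on the full tuple means an entity assigned to the wrong category counts as both a false positive and a false negative, which is the strict reading of the referee's concern about inconsistent categorization. Running the same functions separately per decade would show whether error rates drift over time.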
Circularity Check
No circularity: empirical extraction and counting on external corpus
Full rationale
The paper extracts four categories of method entities (algorithms/models, data resources, software/tools, metrics) from 1990-2022 LIS papers and reports temporal trends, topic-specific evolution, and cross-method patterns. No equations, fitted parameters, predictions derived from fits, or self-referential definitions appear. The central claim (data resources as driver with emergence-stability cycle) is a direct summary of observed counts and co-occurrences; it does not reduce to any input by construction. No load-bearing self-citations or uniqueness theorems are invoked. This is a standard descriptive analysis of an external corpus; extraction accuracy is a validity issue, not a circularity issue.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the corpus of LIS papers from 1990-2022 is a sufficient and unbiased sample for characterizing research method evolution.