
arxiv: 2511.21745 · v2 · submitted 2025-11-22 · 💻 cs.DL

AI-Augmented Bibliometric Framework: A Paradigm Shift with Agentic AI for Dynamic, Snippet-Based Research Analysis

Pith reviewed 2026-05-17 06:42 UTC · model grok-4.3

classification 💻 cs.DL
keywords multi-agent AI · bibliometric analysis · scientometrics · natural language to code · dynamic analysis · research gap synthesis · RAG retrieval · agentic workflows

The pith

A multiagent AI framework translates natural language instructions into safe Python scripts for dynamic scientometric analyses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a generative multiagent AI framework that lets researchers direct complex bibliometric tasks such as network construction and topic modeling through natural language rather than specialized code or rigid interfaces. Four coordinated agents handle analytics generation, full-paper retrieval with retrieval-augmented generation, and automated reporting while running scripts in a sandbox for safety and reproducibility. A sympathetic reader would care because the approach removes programming barriers and the inflexibility of existing platforms, enabling iterative what-if explorations and custom metrics in one environment. If correct, it supports synthesis of research gaps and frontier themes from large literature collections without static workflows.

Core claim

The paper claims that a system of four coordinated AI agents (an analytics generator, a full-paper retriever, a RAG-based researcher assistant embedded in the retriever, and an automated report generator) converts natural language queries into executable Python scripts. Those scripts perform data cleaning, co-authorship and citation network analysis, temporal studies, topic modeling, embedding-based clustering, and research gap synthesis. The system also delivers multimodal full-paper retrieval and dynamic metric creation in a single adaptive session, features the paper describes as absent from tools such as VOSviewer, Bibliometrix, and SciMAT.

What carries the argument

The multi-agent coordination mechanism that translates natural language queries into sandboxed, executable Python scripts for scientometric tasks, paired with retrieval-augmented generation for full-paper synthesis and report production.
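
The paper does not describe how its sandbox is built, so the following is only a minimal sketch of the general pattern: write a generated script to a temporary directory, run it in a separate interpreter with a timeout, and capture the output for later audit. The function name `run_generated_script` and the `timeout_s` parameter are hypothetical. Process isolation of this kind is not a security boundary; a real sandbox would also restrict file system, network, and memory access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_script(source, timeout_s=10):
    """Run generated code in a separate interpreter and capture its output.

    Hypothetical helper: illustrates process isolation plus a timeout,
    not the sandbox used in the paper.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "analysis.py"
        script.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                # -I runs Python in isolated mode (ignores environment
                # variables and the user site-packages directory).
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "",
                    "stderr": f"timed out after {timeout_s}s"}
    # Keeping stdout and stderr gives an audit trail for each run.
    return {"ok": proc.returncode == 0,
            "stdout": proc.stdout,
            "stderr": proc.stderr}


if __name__ == "__main__":
    result = run_generated_script("print(2 + 3)")
    print(result["ok"], result["stdout"].strip())
```

Returning the captured error text rather than raising makes it straightforward to feed a failure back to the generating agent for another attempt.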

If this is right

  • Researchers without programming skills can execute analyses including co-authorship networks, temporal trends, and embedding-based clustering.
  • The system supports iterative what-if analysis and user-driven modifications to analysis pipelines.
  • Each session automatically produces an exportable end-to-end report that includes research gap synthesis.
  • The framework unifies natural-language-to-code scientometrics with multimodal full-paper retrieval in one adaptive environment.
  • Automated identification of frontier themes becomes possible through dynamic metric creation and topic modeling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agentic translation pattern could be tested on other quantitative literature tasks such as patent landscape analysis or grant portfolio review.
  • Conversational sessions might allow teams to refine analyses jointly without needing shared technical expertise.
  • Over time the framework could incorporate live data feeds to keep metrics current as new papers appear.
  • Early-career researchers might reach synthesis insights faster by bypassing the need to learn multiple specialized tools.

Load-bearing premise

The AI agents will reliably produce correct, safe, and complete Python scripts for scientometric tasks without hallucinations or errors that affect the analysis results.

What would settle it

Running a complex query for specific network metrics or clustering on a known dataset and checking whether the generated script produces outputs that match independent manual verification or established bibliometric software without discrepancies.
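
A check of this kind can be sketched on a toy dataset small enough to verify by hand. Everything here is an assumption made for illustration: the three-paper dataset, the helper names, and the choice of co-authorship edge weights and degree as the metrics under test. In practice the `candidate` values would come from the script the framework generated, and the reference values from manual work or an established tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy dataset: each paper is a list of author names.
PAPERS = [
    ["Ada", "Ben", "Cy"],
    ["Ada", "Ben"],
    ["Cy", "Dee"],
]


def coauthor_edges(papers):
    """Count how many papers each unordered author pair shares."""
    edges = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct co-authors per author."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {author: len(others) for author, others in neighbours.items()}


def discrepancies(candidate, reference):
    """Return {key: (candidate_value, reference_value)} where they differ."""
    keys = set(candidate) | set(reference)
    return {k: (candidate.get(k), reference.get(k))
            for k in keys if candidate.get(k) != reference.get(k)}


# Reference values worked out by hand for PAPERS.
EXPECTED_EDGES = {("Ada", "Ben"): 2, ("Ada", "Cy"): 1,
                  ("Ben", "Cy"): 1, ("Cy", "Dee"): 1}
EXPECTED_DEGREE = {"Ada": 2, "Ben": 2, "Cy": 3, "Dee": 1}

edges = coauthor_edges(PAPERS)
print(discrepancies(dict(edges), EXPECTED_EDGES))    # empty dict means agreement
print(discrepancies(degree(edges), EXPECTED_DEGREE))
```

An empty result from `discrepancies` means the two sources agree on every key; any non-empty result pinpoints which author pair or author the generated analysis got wrong.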

Original abstract

Our paper introduces a generative, multiagent AI framework designed to overcome the rigidity, limited flexibility and technical barriers of current bibliometric tools. The objective is to enable researchers to perform fully dynamic, code-based scientometric analysis using natural language NL instructions, eliminating the need for specialized programming skills while expanding analytical depth. Methodologically, the system integrates four coordinated AI agents: a custom analytics generator, a full-paper retriever, including a Retrieval Augmented Generation RAG based researcher assistant and an automated report generator. User queries are translated into executable Python scripts, run within a sandbox ensuring safety, reproducibility and auditability. The framework supports automated data cleaning, construction of co-authorship and citation networks, temporal analyses, topic modeling, embedding based clustering and synthesis of research gaps. Each analytical session produces an exportable, end to end report. The novelty lies in unifying NL to code scientometrics, multimodal full paper retrieval, agentic exploration and dynamic metric creation in a single adaptive environment, capabilities absent in existing platforms: VOSviewer, Bibliometrix, SciMAT. Unlike static GUI based workflows, the proposed framework supports iterative what if analysis, hybrid indicators and user driven pipeline modification. Results demonstrate that the framework generates valid analysis scripts, retrieves and synthesizes full papers, identifies frontier themes and produces reproducible scientometric outputs. It establishes a new paradigm for accessible, interactive and extensible bibliometric knowledge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces a generative multi-agent AI framework for dynamic bibliometric analysis. It integrates four coordinated agents (an analytics generator, a full-paper retriever, a RAG-based researcher assistant, and a report generator) that translate natural-language queries into executable Python scripts for tasks including co-authorship/citation network construction, temporal analysis, topic modeling, and embedding-based clustering. Scripts run in a sandbox for safety and reproducibility; the system also retrieves and synthesizes full papers and produces end-to-end reports. The authors claim this unifies NL-to-code scientometrics, multimodal retrieval, and agentic workflows in a way absent from tools such as VOSviewer, Bibliometrix, and SciMAT, enabling iterative what-if analysis and user-driven modifications. Results are stated to demonstrate valid scripts and reproducible outputs.

Significance. If the reliability claims hold, the work could meaningfully advance scientometrics by lowering technical barriers and enabling adaptive, code-based analyses that static GUI tools cannot support. The sandboxed execution model and emphasis on auditability address practical concerns with AI-generated code. The unification of NL interfaces, full-text RAG, and dynamic metric creation represents a coherent extension of existing agentic-AI ideas into the bibliometrics domain.

major comments (2)
  1. [Abstract / Results] Abstract and Results section: The assertion that 'results demonstrate that the framework generates valid analysis scripts... and produces reproducible scientometric outputs' is unsupported by any quantitative evidence (success rates, error rates, comparison baselines, or example outputs). This directly undermines the central claim that the agentic pipeline reliably performs scientometric tasks.
  2. [Methodology (agent descriptions)] Methodology section on agent coordination: No prompt templates, validation routines, or hallucination-mitigation strategies are provided for the analytics generator when producing Python code for network construction (e.g., networkx), clustering (e.g., sklearn), or statistical interpretation. Given documented failure modes of LLM code generation on library usage and data handling, this is load-bearing for the reproducibility and safety guarantees.
minor comments (3)
  1. [Abstract] Abstract: 'end to end report' should read 'end-to-end report'; 'RAG based' should read 'RAG-based'.
  2. [Introduction / Related Work] The novelty comparison to VOSviewer, Bibliometrix, and SciMAT would be strengthened by an explicit feature-comparison table rather than a prose list.
  3. [Throughout] Ensure first-use definitions for all acronyms (RAG, NL) and consistent hyphenation throughout.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments, which help clarify how to strengthen the empirical grounding and reproducibility of our work. We address each major comment below and commit to specific revisions that directly respond to the concerns raised.

point-by-point responses
  1. Referee: [Abstract / Results] Abstract and Results section: The assertion that 'results demonstrate that the framework generates valid analysis scripts... and produces reproducible scientometric outputs' is unsupported by any quantitative evidence (success rates, error rates, comparison baselines, or example outputs). This directly undermines the central claim that the agentic pipeline reliably performs scientometric tasks.

    Authors: We agree that the current Results section relies primarily on illustrative case studies and example outputs rather than aggregated quantitative metrics. This is a valid observation that weakens the strength of the reliability claim as presented. In the revised manuscript we will expand the Results section with a quantitative evaluation: success rates and error rates measured over a benchmark set of 50 diverse natural-language queries, categorized failure modes, and side-by-side comparisons against Bibliometrix and VOSviewer on metrics such as analysis completion time and flexibility for iterative modifications. These additions will be supported by the sandbox execution logs already collected during development.
    revision: yes

  2. Referee: [Methodology (agent descriptions)] Methodology section on agent coordination: No prompt templates, validation routines, or hallucination-mitigation strategies are provided for the analytics generator when producing Python code for network construction (e.g., networkx), clustering (e.g., sklearn), or statistical interpretation. Given documented failure modes of LLM code generation on library usage and data handling, this is load-bearing for the reproducibility and safety guarantees.

    Authors: We concur that explicit documentation of the prompts and safeguards is necessary to substantiate the reproducibility and safety claims. The revised manuscript will include a dedicated appendix containing the full prompt templates used by the analytics generator, together with the validation routines (syntax checking via AST parsing, sandboxed execution with automatic error feedback, and library-usage verification) and hallucination-mitigation steps (iterative self-correction prompts and optional human review of generated code before final execution). These details were part of the internal implementation but were omitted from the initial submission for brevity; we will now make them available.
    revision: yes
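
Two of the validation routines named in the second response, syntax checking via AST parsing and library-usage verification, can be sketched with Python's standard `ast` module. This is an illustration under assumptions, not the authors' implementation: the allowlist contents and the function name `validate_script` are hypothetical, and the check only inspects `import` statements, so it would miss dynamic imports such as `__import__("os")`.

```python
import ast

# Hypothetical allowlist; the actual list used by the framework is not given.
ALLOWED_MODULES = {"pandas", "numpy", "networkx", "sklearn", "matplotlib"}


def validate_script(source, allowed=ALLOWED_MODULES):
    """Return a list of problems in generated code; an empty list means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Stop at the first syntax error: nothing else can be checked.
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `from . import x` has module=None; report it as disallowed.
            modules = [node.module or "<relative import>"]
        else:
            continue
        for name in modules:
            top_level = name.split(".")[0]
            if top_level not in allowed:
                problems.append(
                    f"line {node.lineno}: import of '{name}' is not allowlisted"
                )
    return problems


if __name__ == "__main__":
    print(validate_script("import networkx as nx\nG = nx.Graph()"))
    print(validate_script("import os\nos.remove('x')"))
    print(validate_script("def f(:"))
```

Because the function returns messages instead of raising, the same strings can serve as the automatic error feedback in a self-correction prompt.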

Circularity Check

0 steps flagged

No circularity: descriptive framework without derivations or self-referential reductions

full rationale

The manuscript presents a conceptual multi-agent AI framework for natural-language-driven bibliometric analysis. It contains no equations, fitted parameters, mathematical derivations, or load-bearing self-citations. Novelty claims rest on the integration of NL-to-code translation, RAG retrieval, and agent coordination rather than any reduction of outputs to inputs by construction. The central assertions about agent reliability and script validity are empirical claims (unsupported here) but do not form a circular derivation chain. This is a standard non-circular descriptive paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on untested assumptions about AI reliability for code generation and retrieval accuracy rather than on external benchmarks or formal proofs.

axioms (1)
  • domain assumption: Large language models can translate natural language instructions into correct and safe Python code for scientometric tasks
    Invoked throughout the description of the analytics generator agent.
invented entities (1)
  • Four coordinated AI agents (analytics generator, full-paper retriever, RAG researcher assistant, report generator) · no independent evidence
    purpose: To handle query translation, retrieval, analysis, and reporting in one adaptive environment
    Newly introduced components whose coordination is central to the claimed unification of capabilities.


