arxiv: 1907.04557 · v1 · pith:5NAIJWPNnew · submitted 2019-07-10 · 💻 cs.SE

Identifying Algorithm Names in Code Comments

Pith reviewed 2026-05-24 23:50 UTC · model grok-4.3

classification 💻 cs.SE
keywords algorithm identification · code comments · N-grams · part of speech · open source projects · rule based method · software comments

The pith

Algorithm names in code comments can be automatically extracted using N-grams ending with the word 'algorithm' and part-of-speech patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an automatic method to identify algorithm names in code comments. Developers frequently mention the algorithms they use in comments, which could supply data for machine learning tasks like generating API sequences or comments. The method extracts N-gram phrases ending with 'algorithm' and applies rules based on part-of-speech patterns to select appropriate names. Evaluation shows these rules reach precision and recall above 0.70. The rules are then used on comments from active open-source projects in seven languages to find commonly mentioned algorithms.

Core claim

The paper claims that N-grams ending with 'algorithm' combined with part-of-speech patterns produce rules that identify algorithm names in code comments with high precision and recall, allowing extraction from large comment collections in C, C++, Java, JavaScript, Python, PHP, and Ruby.
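
The candidate-extraction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ngrams_ending_with`, the tokenizer regex, and the `max_n` limit are all assumptions introduced here.

```python
import re

def ngrams_ending_with(comment, keyword="algorithm", max_n=4):
    """Return every word N-gram (2 <= N <= max_n) whose last word is `keyword`.

    The tokenizer and the max_n cap are illustrative assumptions.
    """
    # Lowercase, then keep word-like tokens; apostrophes and hyphens stay inside a token.
    tokens = re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", comment.lower())
    found = []
    for end, token in enumerate(tokens):
        if token != keyword:
            continue
        # Walk backwards from the keyword to build the 2-gram, 3-gram, and so on.
        for n in range(2, max_n + 1):
            start = end - n + 1
            if start < 0:
                break
            found.append(" ".join(tokens[start:end + 1]))
    return found

candidates = ngrams_ending_with("Uses Dijkstra's algorithm to find the path.")
print(candidates)
```

On this comment the sketch yields both a plausible name and a noisy longer phrase, which is why a second filtering step is needed.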

What carries the argument

N-gram words containing 'algorithm' as the final word, refined by part-of-speech patterns to form identification rules.
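
A rough sketch of how part-of-speech patterns might refine such candidates is shown below. Everything in it is hypothetical: the `TOY_TAGS` lookup stands in for a real POS tagger, and the `ACCEPTED_PATTERNS` set is an invented example, since the actual rules derived in the paper are not reproduced on this page.

```python
# Toy tag lookup standing in for a real POS tagger (hypothetical, illustration only).
TOY_TAGS = {
    "dijkstra's": "NNP", "greedy": "JJ", "the": "DT",
    "uses": "VBZ", "algorithm": "NN",
}

# Invented accept-list of tag sequences; not the rules from the paper.
ACCEPTED_PATTERNS = {("NNP", "NN"), ("NN", "NN"), ("JJ", "NN")}

def tag(phrase):
    """Map each word to a tag, defaulting to NN for words missing from the toy lookup."""
    return tuple(TOY_TAGS.get(word, "NN") for word in phrase.split())

def looks_like_algorithm_name(phrase):
    """Keep a candidate only if its full tag sequence is on the accept-list."""
    return tag(phrase) in ACCEPTED_PATTERNS

candidates = ["dijkstra's algorithm", "uses dijkstra's algorithm",
              "the algorithm", "greedy algorithm"]
kept = [c for c in candidates if looks_like_algorithm_name(c)]
print(kept)
```

Phrases that start with a determiner or a verb are rejected here, while proper-noun and adjective modifiers pass.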

If this is right

  • Code comments become a source of labeled data for machine learning in software engineering.
  • Commonly used algorithms can be listed from real projects across multiple languages.
  • The approach works on a large scale without needing manual labeling for each project.
  • Similar techniques might identify other specific terms in comments.
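
Listing commonly mentioned algorithms per language could look roughly like this. The input pairs are invented toy data, and `most_common_names` is a hypothetical helper, not something taken from the paper.

```python
from collections import Counter

def most_common_names(extracted, top_k=3):
    """Rank names per language; `extracted` holds (language, algorithm_name) pairs."""
    per_language = {}
    for language, name in extracted:
        per_language.setdefault(language, Counter())[name] += 1
    return {lang: counts.most_common(top_k) for lang, counts in per_language.items()}

# Invented toy data, for illustration only.
extracted = [
    ("Python", "sorting algorithm"),
    ("Python", "sorting algorithm"),
    ("Python", "hash algorithm"),
    ("C", "hash algorithm"),
]
ranking = most_common_names(extracted)
print(ranking)
```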

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the rules generalize, they could support better training of models for code documentation.
  • Analysis of extracted names might show differences in algorithm usage by language or project type.
  • The list of names could aid in developing algorithm recommendation tools for developers.

Load-bearing premise

The assumption that N-grams ending with 'algorithm' and POS patterns will capture algorithm names reliably without many false positives or misses across varied code.

What would settle it

Finding that manual inspection of extracted names from additional projects yields precision or recall below 0.70.
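
The check described above comes down to computing precision and recall against manually labeled names and comparing both with 0.70. A minimal sketch, assuming set-based exact matching of names (the example names are invented):

```python
def precision_recall(predicted, gold):
    """Set-based precision and recall of extracted names against manual labels."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Invented example: four extracted names, four manually labeled names, three in common.
predicted = {"greedy algorithm", "sorting algorithm", "hash algorithm", "the algorithm"}
gold = {"greedy algorithm", "sorting algorithm", "hash algorithm", "euclidean algorithm"}

precision, recall = precision_recall(predicted, gold)
meets_threshold = precision > 0.70 and recall > 0.70
print(precision, recall, meets_threshold)
```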

Figures

Figures reproduced from arXiv: 1907.04557 by Arnon Rungsawang, Bundit Manaskasemsak, Hideaki Hata, Jakapong Klainongsuang, Kenichi Matsumoto, Pattara Leelaprute, Yusuf Sulistyo Nugroho.

Figure 1: Overview of our FLOSS in creating the rule
Original abstract

For recent machine-learning-based tasks like API sequence generation, comment generation, and document generation, large amount of data is needed. When software developers implement algorithms in code, we find that they often mention algorithm names in code comments. Code annotated with such algorithm names can be valuable data sources. In this paper, we propose an automatic method of algorithm name identification. The key idea is extracting important N-gram words containing the word `algorithm' in the last. We also consider part of speech patterns to derive rules for appropriate algorithm name identification. The result of our rule evaluation produced high precision and recall values (more than 0.70). We apply our rules to extract algorithm names in a large amount of comments from active FLOSS projects written in seven programming languages, C, C++, Java, JavaScript, Python, PHP, and Ruby, and report commonly mentioned algorithm names in code comments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript describes a rule-based method for identifying algorithm names in source code comments. The approach extracts N-grams that end with the word 'algorithm' and filters them using part-of-speech patterns. The authors report that their rules achieve precision and recall values exceeding 0.70 on an evaluation set. They then apply the rules to comments from active free/libre open source software (FLOSS) projects in seven languages (C, C++, Java, JavaScript, Python, PHP, Ruby) and present commonly mentioned algorithm names.

Significance. If the evaluation methodology is sound, the work could support creation of large annotated datasets for downstream ML tasks in software engineering such as comment generation and API sequence prediction. The multi-language application to real FLOSS projects is a positive aspect of the large-scale extraction step.

major comments (3)
  1. [Evaluation] Evaluation section: The manuscript reports precision and recall >0.70 but provides no information on the size or construction of the labeled evaluation set, nor whether this set was held out from the data used to derive the N-gram and POS rules. This is load-bearing for the generalization claim across seven languages and coding styles.
  2. [Large-scale extraction] Large-scale extraction section: No manual validation, sampling, or error analysis is reported for the algorithm names extracted from the FLOSS comment corpus. Without this, the claim that the rules reliably identify algorithm names in active projects cannot be assessed.
  3. [Method and evaluation] Method and evaluation sections: The paper presents no baseline comparisons (e.g., keyword matching without POS filtering or simple frequency-based extraction) against which the contribution of the POS patterns can be measured.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'high precision and recall values (more than 0.70)' should be replaced with the exact measured values and the size of the evaluation set for clarity.
  2. The manuscript would benefit from a short related-work subsection situating the heuristic against prior NLP techniques applied to code comments.
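
The manual validation asked for in major comment 2 would typically start from a reproducible random sample of extracted instances. One possible sketch follows; the helper name, the sample size default, and the fixed seed are assumptions made here, not details from the manuscript.

```python
import random

def sample_for_review(instances, sample_size=100, seed=0):
    """Draw a reproducible random sample of extracted instances for manual labeling."""
    instances = list(instances)
    if len(instances) <= sample_size:
        return instances  # too few to sample: review everything
    rng = random.Random(seed)  # fixed seed so reviewers can regenerate the same sample
    return rng.sample(instances, sample_size)

# Invented placeholder instances standing in for names extracted from a corpus.
names = [f"name {i} algorithm" for i in range(500)]
review_batch = sample_for_review(names)
print(len(review_batch))
```

Precision on the sample is then the share of reviewed instances that annotators judge to be real algorithm names.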

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments correctly identify areas where additional detail would strengthen the paper. We address each point below and will revise the manuscript to incorporate the requested information.

  1. Referee: [Evaluation] Evaluation section: The manuscript reports precision and recall >0.70 but provides no information on the size or construction of the labeled evaluation set, nor whether this set was held out from the data used to derive the N-gram and POS rules. This is load-bearing for the generalization claim across seven languages and coding styles.

    Authors: We agree that the Evaluation section lacks necessary details. In the revised manuscript we will add a description of the labeled set size, its construction via manual annotation of N-grams from code comments, and explicit confirmation that the evaluation instances were held out from the development data used to formulate the N-gram and POS rules. revision: yes

  2. Referee: [Large-scale extraction] Large-scale extraction section: No manual validation, sampling, or error analysis is reported for the algorithm names extracted from the FLOSS comment corpus. Without this, the claim that the rules reliably identify algorithm names in active projects cannot be assessed.

    Authors: We acknowledge the absence of validation for the large-scale results. We will add a new subsection reporting a manual sampling and error analysis of extracted names from the FLOSS corpus (e.g., precision on a random sample of 100 instances) to support the reliability claim. revision: yes

  3. Referee: [Method and evaluation] Method and evaluation sections: The paper presents no baseline comparisons (e.g., keyword matching without POS filtering or simple frequency-based extraction) against which the contribution of the POS patterns can be measured.

    Authors: We agree that baselines would clarify the value of the POS patterns. In the revision we will add a comparison subsection evaluating a keyword-only baseline (N-grams ending in 'algorithm' without POS filtering) and a frequency-based extraction method on the same evaluation set, reporting their precision and recall. revision: yes
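
The baseline comparison discussed in point 3 could be set up along these lines. The evaluation entries, their tag sequences, and the accepted patterns are all invented for illustration; only the idea of scoring a keyword-only baseline and a POS-filtered variant on the same labeled set comes from the exchange above.

```python
def score(predicted, gold):
    """Return (precision, recall) for a set of predicted names."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Invented toy evaluation set: candidate -> (tag sequence, manually judged as a name?)
EVAL_SET = {
    "greedy algorithm": (("JJ", "NN"), True),
    "sorting algorithm": (("NN", "NN"), True),
    "the algorithm": (("DT", "NN"), False),
    "this algorithm": (("DT", "NN"), False),
    "new algorithm": (("JJ", "NN"), False),
}
# Hypothetical accept-list of tag sequences, not the paper's rules.
ACCEPTED_PATTERNS = {("JJ", "NN"), ("NN", "NN"), ("NNP", "NN")}

gold = {c for c, (_, is_name) in EVAL_SET.items() if is_name}
keyword_only = set(EVAL_SET)  # baseline: accept every N-gram ending in the keyword
pos_filtered = {c for c, (tags, _) in EVAL_SET.items() if tags in ACCEPTED_PATTERNS}

keyword_only_scores = score(keyword_only, gold)
pos_filtered_scores = score(pos_filtered, gold)
print(keyword_only_scores, pos_filtered_scores)
```

In this toy setup the POS filter raises precision without losing recall, but it still admits "new algorithm", the kind of false positive an error analysis would surface.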

Circularity Check

0 steps flagged

No circularity in derivation chain

The paper proposes a heuristic rule set that extracts N-grams ending in the word 'algorithm' and filters them via POS patterns. These rules are evaluated on external comment data from FLOSS projects to report precision and recall above 0.70, then applied to a larger corpus across seven languages. No equations, fitted parameters, self-citations, or self-definitional steps appear in the provided description. The central claim rests on direct evaluation against held-out or external data rather than any reduction of outputs to inputs by construction, so the derivation is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard NLP assumptions about the utility of N-grams and POS tagging for extracting technical terms from comments. No free parameters or invented entities are described in the abstract.

axioms (1)
  • Domain assumption: part-of-speech taggers produce reliable tags on code comments.
    The method derives identification rules from POS patterns.

