Mercem: Method Name Recommendation Based on Call Graph Embedding
Pith reviewed 2026-05-24 22:33 UTC · model grok-4.3
The pith
Embedding the method call graph yields more appropriate name suggestions than prior techniques in difficult cases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that applying graph embedding techniques to the method call graph produces more appropriate method name candidates than the state-of-the-art approach, especially in difficult situations where local features alone are insufficient.
What carries the argument
Graph embedding of the method call graph, which encodes structural relationships among methods to infer naming intent.
If this is right
- Name recommendation becomes usable for methods whose behavior is defined more by their position in the call structure than by their internal statements.
- The same embedding step can rank candidate names by how well they match observed graph neighborhoods.
- Developers receive suggestions even when a method has few or no statements yet written.
- Existing name recommenders can be augmented by adding call-graph context without replacing their local analysis.
Where Pith is reading between the lines
- The same embedding could be applied to recommend names for other identifiers such as variables or classes once their dependency graphs are constructed.
- Integration into an IDE would allow real-time suggestions that update as the call graph changes during editing.
- If call-graph structure proves predictive, similar embeddings might help with related tasks such as detecting misplaced methods or suggesting refactorings.
Load-bearing premise
Structural patterns visible in the call graph alone carry enough information about a method's intended purpose to guide accurate name recommendations.
What would settle it
A direct comparison on the same difficult cases where the embedding method's name suggestions receive lower appropriateness scores than the prior technique.
Figures
read the original abstract
Comprehensibility of source code is strongly affected by identifier names, therefore software developers need to give good (e.g. meaningful but short) names to identifiers. On the other hand, giving a good name is sometimes a difficult and time-consuming task even for experienced developers. To support naming identifiers, several techniques for recommending identifier name candidates have been proposed. These techniques, however, still have challenges on the goodness of suggested candidates and limitations on applicable situations. This paper proposes a new approach to recommending method names by applying graph embedding techniques to the method call graph. The evaluation experiment confirms that the proposed technique can suggest more appropriate method name candidates in difficult situations than the state of the art approach.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Mercem, a technique for recommending method names that applies graph embedding to the method call graph. It claims that an evaluation experiment demonstrates superiority over the state-of-the-art approach specifically in difficult situations for suggesting appropriate method name candidates.
Significance. If the evaluation substantiates the superiority claim and the structural embeddings reliably capture semantic intent, the work would introduce a graph-based structural approach to identifier naming that could complement existing techniques and improve code comprehensibility. The paper's focus on 'difficult situations' is a targeted contribution, but the central assumption that call-graph proximity equates to semantic proximity requires explicit validation to establish impact.
major comments (2)
- [Evaluation] Evaluation section (details referenced in abstract): The abstract states that an evaluation experiment confirms superiority but supplies no information on dataset size, chosen metrics, baselines compared, statistical significance tests, or how 'difficult situations' were isolated and measured. This omission leaves the central claim of measurable improvement unsupported and load-bearing for the paper's contribution.
- [Approach and Evaluation] §3 (Approach) and §4 (Evaluation): The claim that call-graph embeddings alone suffice for semantic name recommendation in difficult cases is not accompanied by an analysis or ablation that tests the failure mode where isomorphic call-graph positions correspond to unrelated domain roles. Without such a test, the superiority result cannot be distinguished from cases where structural proximity fails to encode intent.
minor comments (2)
- [Abstract] The abstract and introduction should explicitly define the metrics used to judge 'more appropriate' name candidates and how the SOTA baseline was implemented for fair comparison.
- [Approach] Notation for the graph embedding (e.g., how nodes and edges are represented) should be introduced with a small example to improve readability before the evaluation results.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond to each major point below and indicate planned revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [Evaluation] Evaluation section (details referenced in abstract): The abstract states that an evaluation experiment confirms superiority but supplies no information on dataset size, chosen metrics, baselines compared, statistical significance tests, or how 'difficult situations' were isolated and measured. This omission leaves the central claim of measurable improvement unsupported and load-bearing for the paper's contribution.
Authors: The full evaluation details appear in Section 4, covering the dataset of call graphs extracted from open-source Java projects, metrics including top-k accuracy, the state-of-the-art baselines, and the definition of difficult situations as naming cases where prior techniques yield low similarity scores. Statistical significance is assessed with appropriate tests such as Wilcoxon signed-rank. We agree the abstract is overly terse on these points and will revise it to briefly summarize the evaluation scale, metrics, and key findings. revision: yes
-
Referee: [Approach and Evaluation] §3 (Approach) and §4 (Evaluation): The claim that call-graph embeddings alone suffice for semantic name recommendation in difficult cases is not accompanied by an analysis or ablation that tests the failure mode where isomorphic call-graph positions correspond to unrelated domain roles. Without such a test, the superiority result cannot be distinguished from cases where structural proximity fails to encode intent.
Authors: The referee correctly identifies a potential gap in validating the core assumption. Our empirical results demonstrate that the embeddings improve recommendations precisely in difficult cases, providing indirect support for the approach. To directly address the concern, we will add a discussion paragraph examining the assumption, noting observed cases from the evaluation where structure and semantics align, and acknowledging scenarios where they may diverge. revision: yes
Circularity Check
No circularity; empirical ML approach with independent evaluation
full rationale
The paper describes an applied technique that constructs call-graph embeddings and evaluates name-recommendation quality against baselines on held-out data. No equations, derivations, or first-principles claims appear in the provided text. The central result is an experimental comparison, not a reduction of any output to a fitted parameter or self-citation chain. The assumption that structural proximity encodes semantic intent is a modeling choice subject to external falsification, not a definitional tautology.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Program comprehension during software maintenance and evolution,
A. V on Mayrhauser and A. M. Vans, “Program comprehension during software maintenance and evolution,” Computer, vol. 28, no. 8, pp. 44– 55, 1995
work page 1995
-
[2]
Manifesto for agile software development,
K. Beck, M. Beedle, A. van Bennekum, A. Cockburn, W. Cunningham, M. Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries, J. Kern, B. Marick, R. C. Martin, S. Mellor, K. Schwaber, J. Sutherland, and D. Thomas, “Manifesto for agile software development,” 2001. [Online]. Available: http://www.agilemanifesto.org/
work page 2001
-
[3]
The emergent structure of development tasks,
G. C. Murphy, M. Kersten, M. P. Robillard, and D. Cubranic, “The emergent structure of development tasks,” in ECOOP, vol. 5. Springer, 2005, pp. 33–48
work page 2005
-
[4]
Characteristics of application software maintenance,
B. P. Lientz, E. B. Swanson, and G. E. Tompkins, “Characteristics of application software maintenance,” Communications of the ACM , vol. 21, no. 6, pp. 466–471, 1978
work page 1978
-
[5]
What’s in a name? a study of identifiers,
D. Lawrie, C. Morrell, H. Feild, and D. Binkley, “What’s in a name? a study of identifiers,” in Proceedings of the 14th IEEE International Conference on Program Comprehension , ser. ICPC ’06. Washington, DC, USA: IEEE Computer Society, 2006, pp. 3–12. [Online]. Available: https://doi.org/10.1109/ICPC.2006.51
-
[6]
R. C. Martin, Clean Code: A Handbook of Agile Software Craftsman- ship, 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR, 2008, ch. Chapter 2: Meaningful Names
work page 2008
-
[7]
D. Boswell and T. Foucher, The Art of Readable Code: Simple and Practical Techniques for Writing Better Code . O’Reilly, 2012, ch. Chapter Two: Packing Information into Names
work page 2012
-
[8]
Recommending verbs for rename method using association rule mining,
Y . Kashiwabara, Y . Onizuka, T. Ishio, Y . Hayase, T. Yamamoto, and K. Inoue, “Recommending verbs for rename method using association rule mining,” in Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE), 2014 Software Evolution Week-IEEE Con- ference on. IEEE, 2014, pp. 323–327
work page 2014
-
[9]
Suggesting accurate method and class names,
M. Allamanis, E. T. Barr, C. Bird, and C. Sutton, “Suggesting accurate method and class names,” in Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering . ACM, 2015, pp. 38–49
work page 2015
-
[10]
Linguistic regularities in con- tinuous space word representations
T. Mikolov, W.-t. Yih, and G. Zweig, “Linguistic regularities in con- tinuous space word representations.” in hlt-Naacl, vol. 13, 2013, pp. 746–751
work page 2013
-
[11]
A convolutional attention network for extreme summarization of source code,
M. Allamanis, H. Peng, and C. Sutton, “A convolutional attention network for extreme summarization of source code,” in International Conference on Machine Learning (ICML) , 2016
work page 2016
-
[12]
Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1,
G. E. Hinton, J. L. McClelland, and D. E. Rumelhart, “Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1,” D. E. Rumelhart, J. L. McClelland, and C. PDP Research Group, Eds. Cambridge, MA, USA: MIT Press, 1986, ch. Distributed Representations, pp. 77–109. [Online]. Available: http://dl.acm.org/ citation.cfm?id=104279.104287
-
[13]
Efficient Estimation of Word Representations in Vector Space
T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781 , 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[14]
Distributed representations of words and phrases and their composi- tionality,
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their composi- tionality,” in Advances in neural information processing systems , 2013, pp. 3111–3119
work page 2013
-
[15]
Z. S. Harris, “Distributional structure,” Word, vol. 10, no. 2-3, pp. 146– 162, 1954
work page 1954
-
[16]
A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications
H. Cai, V . W. Zheng, and K. C. Chang, “A comprehensive survey of graph embedding: Problems, techniques and applications,” CoRR, vol. abs/1709.07604, 2017. [Online]. Available: http://arxiv.org/abs/1709. 07604
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[17]
Deepwalk: Online learning of social representations,
B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , ser. KDD ’14. New York, NY , USA: ACM, 2014, pp. 701–710. [Online]. Available: http://doi.acm.org/10.1145/2623330.2623732
-
[18]
Node2vec: Scalable feature learning for networks,
A. Grover and J. Leskovec, “Node2vec: Scalable feature learning for networks,” in Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , ser. KDD ’16. New York, NY , USA: ACM, 2016, pp. 855–864. [Online]. Available: http://doi.acm.org/10.1145/2939672.2939754
-
[19]
The programmer’s lexicon, volume i: The verbs,
E. W. Host and B. M. Ostvold, “The programmer’s lexicon, volume i: The verbs,” in Source Code Analysis and Manipulation, 2007. SCAM
work page 2007
-
[20]
Seventh IEEE International Working Conference on. IEEE, 2007, pp. 193–202
work page 2007
-
[21]
E. W. Høst and B. M. Østvold, “Debugging method names,” in European Conference on Object-Oriented Programming. Springer, 2009, pp. 294– 317
work page 2009
-
[22]
Constructing the call graph of a program,
B. G. Ryder, “Constructing the call graph of a program,” IEEE Trans. Softw. Eng., vol. 5, no. 3, pp. 216–226, May 1979. [Online]. Available: http://dx.doi.org/10.1109/TSE.1979.234183
-
[23]
Extending and evaluating flow-insenstitive and context-insensitive points-to analyses for java,
D. Liang, M. Pennings, and M. J. Harrold, “Extending and evaluating flow-insenstitive and context-insensitive points-to analyses for java,” in Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering , ser. PASTE ’01. New York, NY , USA: ACM, 2001, pp. 73–79. [Online]. Available: http://doi.acm.org/10.1145...
-
[24]
Y . Smaragdakis and G. Balatsouras, “Pointer analysis,” Found. Trends Program. Lang., vol. 2, no. 1, pp. 1–69, Apr. 2015. [Online]. Available: http://dx.doi.org/10.1561/2500000014
-
[25]
Part-of-speech tagging of program identifiers for improved text-based software engi- neering tools,
S. Gupta, S. Malik, L. Pollock, and K. Vijay-Shanker, “Part-of-speech tagging of program identifiers for improved text-based software engi- neering tools,” in Program Comprehension (ICPC), 2013 IEEE 21st International Conference on . IEEE, 2013, pp. 3–12
work page 2013
-
[26]
Repository for the code of the
“Repository for the code of the ”a convolutional attention network for extreme summarization of source code” paper,” https://github.com/ mast-group/convolutional-attention/
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.