LinkAnchor: An Autonomous LLM-Based Agent for Issue-to-Commit Link Recovery
Pith reviewed 2026-05-18 22:37 UTC · model grok-4.3
The pith
LinkAnchor recovers issue-to-commit links using an LLM agent that lazily fetches only needed context instead of checking every pair.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LinkAnchor is the first autonomous LLM-based agent designed specifically for issue-to-commit link recovery. It introduces a lazy-access architecture that allows the underlying LLM to dynamically retrieve only the most relevant contextual data, such as commits, issue comments, and code files, without exceeding token limits. This formulation replaces exhaustive pairwise evaluation with an agent-driven process that can follow temporal and parental dependencies among commits.
What carries the argument
The lazy-access architecture, which lets the LLM agent decide on demand which commits, comments, or code files to retrieve next.
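The abstract does not say how the lazy-access interface is exposed to the model, so the following is only a minimal sketch of the general pattern, not LinkAnchor's implementation. Every name here (`LazyRepo`, `run_agent`, `scripted_policy`, the `"fixed in <sha>"` comment convention) is hypothetical, and a scripted function stands in for the LLM that would normally choose the next tool call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Context = List[Tuple[str, str]]  # (source label, text) pairs the agent has seen so far
Action = Tuple[str, object]      # (tool name, argument), or ("stop", answer)


@dataclass
class LazyRepo:
    """Hypothetical in-memory repository: data enters the context only when requested."""
    commits: Dict[str, str]         # commit hash -> commit message
    comments: Dict[int, List[str]]  # issue id -> issue comments
    fetched: List[str] = field(default_factory=list)  # log of retrievals actually made

    def get_commit(self, sha: str) -> str:
        self.fetched.append(f"commit:{sha}")
        return self.commits[sha]

    def get_comments(self, issue_id: int) -> List[str]:
        self.fetched.append(f"comments:{issue_id}")
        return self.comments[issue_id]


def run_agent(repo: LazyRepo, issue_id: int,
              policy: Callable[[int, Context], Action],
              max_steps: int = 10) -> Optional[object]:
    """Ask the policy for one tool call per step until it stops or the budget runs out."""
    context: Context = []
    for _ in range(max_steps):
        tool, arg = policy(issue_id, context)
        if tool == "stop":
            return arg
        if tool == "get_comments":
            context.extend((f"comments:{arg}", text) for text in repo.get_comments(arg))
        elif tool == "get_commit":
            context.append((f"commit:{arg}", repo.get_commit(arg)))
        else:
            raise ValueError(f"unknown tool: {tool}")
    return None


def scripted_policy(issue_id: int, context: Context) -> Action:
    """Stand-in for the LLM (assumption): read comments, then fetch a commit they mention."""
    if not context:
        return ("get_comments", issue_id)
    seen = {source for source, _ in context}
    for source, text in context:
        if source.startswith("comments:") and "fixed in " in text:
            sha = text.split("fixed in ", 1)[1].split()[0]
            if f"commit:{sha}" not in seen:
                return ("get_commit", sha)
            return ("stop", sha)
    return ("stop", None)
```

In this toy setup the retrieval log `repo.fetched` records exactly which artifacts entered the context, which is the quantity a lazy-access design tries to keep small.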
If this is right
- Repositories with very long commit histories become tractable because only relevant slices are loaded.
- Cases where an issue is fixed by a sequence of related commits rather than one atomic change are handled directly.
- Computational cost drops because the agent avoids scoring every possible issue-commit pair.
- Temporal ordering and parent-child relationships among commits can influence the final linking decision.
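To make the multi-commit case concrete, here is a small sketch of following parent links back from a final fix commit. It is an illustrative heuristic under stated assumptions, not the paper's procedure: in LinkAnchor the LLM decides what to follow, whereas this toy version uses a plain substring match on a hypothetical issue tag, and `collect_fix_chain` and its arguments are invented for the example.

```python
from typing import Dict, List, Optional


def collect_fix_chain(anchor: str,
                      parents: Dict[str, List[str]],
                      messages: Dict[str, str],
                      issue_tag: str) -> List[str]:
    """Walk first-parent links back from `anchor` while commits mention `issue_tag`.

    Returns the chain oldest-first. Substring matching is a simplification
    (for instance, "#92" would also match "#920").
    """
    chain: List[str] = []
    current: Optional[str] = anchor
    while current is not None and issue_tag in messages.get(current, ""):
        chain.append(current)
        parent_list = parents.get(current, [])
        current = parent_list[0] if parent_list else None
    chain.reverse()
    return chain
```

A pairwise scorer that sees each commit in isolation has no direct way to return such a chain as one answer; a walk like this returns the intermediate commit and the final fix together.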
Where Pith is reading between the lines
- The same selective-retrieval pattern could be applied to other traceability problems such as linking requirements to code.
- In practice, the approach might reduce the manual effort developers spend keeping issue-commit links up to date.
- One open question is how to audit or correct the agent's retrieval decisions when they overlook useful context.
Load-bearing premise
That the LLM agent can correctly decide which data sources to fetch at each step, and that selective retrieval will still capture the full set of commits needed to resolve an issue.
What would settle it
A test repository where the agent's retrieval choices omit one or more commits that exhaustive pairwise checking would have identified as part of the fix chain, producing measurably lower recall.
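The check described above can be sketched as a recall comparison against an exhaustive run. This is a hypothetical harness, assuming the agent's links, the exhaustive baseline's links, and the ground truth are each available as collections of commit hashes; the function names are not from the paper.

```python
from typing import Iterable, Set, Tuple


def recall(predicted: Iterable[str], truth: Iterable[str]) -> float:
    """Fraction of ground-truth commits that appear in `predicted`."""
    truth_set = set(truth)
    if not truth_set:
        raise ValueError("ground truth must contain at least one commit")
    return len(set(predicted) & truth_set) / len(truth_set)


def compare_to_exhaustive(agent_links: Iterable[str],
                          exhaustive_links: Iterable[str],
                          truth: Iterable[str]) -> Tuple[float, float, Set[str]]:
    """Return (agent recall, exhaustive recall, true commits only the exhaustive run found)."""
    agent_set = set(agent_links)
    exhaustive_set = set(exhaustive_links)
    truth_set = set(truth)
    missed_by_agent = (exhaustive_set & truth_set) - agent_set
    return recall(agent_set, truth_set), recall(exhaustive_set, truth_set), missed_by_agent
```

A non-empty third element together with a lower agent recall is the outcome that would count against the load-bearing premise.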
Original abstract
Issue-to-commit link recovery in software repositories is fundamental to software traceability and project management, yet it remains a challenging task. Prior studies show that only about 42.2% of issues on GitHub are correctly linked to their commits, highlighting the need for more effective solutions. Existing work has explored a range of ML/DL approaches, and more recently, large language models (LLMs) have been applied to this problem. However, these methods face two major limitations. First, LLMs are restricted by limited context windows and cannot simultaneously process all available data sources, such as long commit histories, extensive issue discussions, and large code repositories. Second, most approaches operate on individual issue-commit pairs, where a model independently scores the relevance of a single commit to an issue. This pairwise formulation fails to account for the complex associativity of software fixes, where an issue is often resolved by an aggregate chain of commits rather than a single atomic change. By ignoring these temporal and parental dependencies, existing methods often fail to incorporate the complete resolution logic and might misidentify intermediate commits as final fixes. Furthermore, this strategy is computationally inefficient in large repositories, as it requires exhaustively evaluating an enormous number of candidate pairs. To address these challenges, we present LinkAnchor, the first autonomous LLM-based agent designed specifically for issue-to-commit link recovery. LinkAnchor introduces a lazy-access architecture that allows the underlying LLM to dynamically retrieve only the most relevant contextual data, such as commits, issue comments, and code files, without exceeding token limits.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents LinkAnchor as the first autonomous LLM-based agent for issue-to-commit link recovery. It identifies two key limitations in prior work: limited LLM context windows, which prevent processing all data sources at once, and pairwise scoring, which fails to account for chains of commits that jointly resolve an issue. The proposed lazy-access architecture lets the agent dynamically retrieve relevant contextual data, such as commits, comments, and code files, to overcome these limitations.
Significance. Should the lazy-access architecture prove reliable in practice, the work has the potential to enhance software traceability by better capturing complex resolution logic in large repositories, leading to more accurate links than exhaustive pairwise methods. This could have practical benefits for project management. The design innovates by treating the LLM as an agent that selectively accesses data, which is a promising direction if empirically supported.
major comments (2)
- [Abstract] The claim that the lazy-access architecture enables capturing the complete resolution logic depends on the LLM agent's ability to make correct relevance decisions for retrieval. The manuscript does not specify the mechanism for deciding what to fetch next, the prompting strategy, or any guarantees against missing key intermediate or parent commits in the fix chain.
- [Abstract] No evaluation results, datasets, quantitative metrics, or comparisons with baseline methods are provided. This absence leaves the central claim that LinkAnchor effectively addresses the stated limitations without demonstrated support, as the soundness of the approach cannot be assessed.
minor comments (1)
- [Abstract] The statement 'only about 42.2% of issues on GitHub are correctly linked' would benefit from a citation to the prior study.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which highlight important aspects that require clarification and expansion. We address each major comment below and will incorporate revisions to improve the manuscript.
Point-by-point responses
- Referee: [Abstract] The claim that the lazy-access architecture enables capturing the complete resolution logic depends on the LLM agent's ability to make correct relevance decisions for retrieval. The manuscript does not specify the mechanism for deciding what to fetch next, the prompting strategy, or any guarantees against missing key intermediate or parent commits in the fix chain.
  Authors: We agree that the current manuscript presents the lazy-access architecture at a conceptual level without sufficient detail on the agent's internal decision process. In the revised version, we will expand the methodology section to describe the retrieval policy, including the criteria for selecting the next data item (e.g., semantic similarity to the issue description, temporal recency, and dependency analysis), the exact prompting templates used to elicit relevance judgments from the LLM, and iterative verification steps designed to reduce the chance of overlooking intermediate or parent commits. While we cannot offer formal guarantees given the stochastic nature of LLMs, we will report coverage statistics from our experiments and discuss mitigation strategies such as backtracking and multi-path exploration.
  revision: yes
- Referee: [Abstract] No evaluation results, datasets, quantitative metrics, or comparisons with baseline methods are provided. This absence leaves the central claim that LinkAnchor effectively addresses the stated limitations without demonstrated support, as the soundness of the approach cannot be assessed.
  Authors: The initial manuscript emphasizes the architectural contribution and the motivation for moving beyond pairwise scoring and context-window constraints. We acknowledge that this leaves the practical effectiveness unverified in the submitted version. We will add a full evaluation section that specifies the datasets (curated GitHub repositories with ground-truth links), the metrics (precision, recall, F1, and chain-coverage rate), and direct comparisons against both traditional ML baselines and recent LLM-based pairwise methods. Results will demonstrate improved accuracy on complex multi-commit resolutions while maintaining computational efficiency.
  revision: yes
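The metrics named in the response can be sketched as follows. Precision, recall, and F1 are computed over issue-commit pairs in the standard way. "Chain-coverage rate" is not defined in the text, so the version below is an assumption: the fraction of issues whose entire ground-truth commit set was recovered. The type alias and function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Links = Dict[int, Set[str]]  # issue id -> set of linked commit hashes


def link_metrics(predicted: Links, truth: Links) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over (issue, commit) pairs."""
    pred_pairs = {(issue, c) for issue, cs in predicted.items() for c in cs}
    true_pairs = {(issue, c) for issue, cs in truth.items() for c in cs}
    tp = len(pred_pairs & true_pairs)
    precision = tp / len(pred_pairs) if pred_pairs else 0.0
    recall = tp / len(true_pairs) if true_pairs else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def chain_coverage_rate(predicted: Links, truth: Links) -> float:
    """Assumed definition: share of issues whose full ground-truth chain was recovered."""
    if not truth:
        return 0.0
    covered = sum(1 for issue, commits in truth.items()
                  if commits <= predicted.get(issue, set()))
    return covered / len(truth)
```

Under this assumed definition, an issue counts as covered only if no commit in its fix chain is missing, so a method can score well on pairwise recall and still score poorly on chain coverage.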
Circularity Check
No circularity: design proposal is self-contained
Full rationale
The paper describes a new LLM agent architecture (lazy-access for dynamic retrieval) to address context limits and pairwise limitations in issue-to-commit recovery. No equations, fitted parameters, predictions, or derivations are present that could reduce to inputs by construction. The abstract and provided text introduce the system as a novel contribution without self-definitional loops, self-citation load-bearing on uniqueness theorems, or renaming of known results. This is a standard engineering/design paper whose claims rest on the proposed mechanism rather than any reduction to prior fitted quantities or self-referential definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLMs can make reliable decisions about which repository artifacts to retrieve next given partial context.
invented entities (1)
- lazy-access architecture: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Passage: "LinkAnchor introduces a lazy-access architecture that allows the underlying LLM to dynamically retrieve only the most relevant contextual data... without exceeding token limits."
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Passage: "LinkAnchor formulates ILR as a search problem for the LLM, eliminating the need to exhaustively assess the relevance of every commit"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.