Recognition: no theorem link
Finding Memory Leaks in C/C++ Programs via Neuro-Symbolic Augmented Static Analysis
Pith reviewed 2026-05-14 22:30 UTC · model grok-4.3
The pith
MemHint augments static analyzers with LLMs and Z3 to detect 52 memory leaks in 3.4 million lines of C/C++ code.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MemHint parses a target codebase and applies an LLM to label each function as an allocator, a deallocator, or neither, producing ownership summaries. It then discards any summary whose claimed operation cannot occur on a feasible path according to Z3, injects the surviving summaries into CodeQL and Infer, uses Z3 again to drop warnings on infeasible paths, and runs a final LLM check to retain only genuine bugs.
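The data flow above can be sketched in a few lines of Python, with the LLM call and the Z3 query replaced by trivial stand-ins (`classify` uses a name heuristic, `reachable` a toy oracle); every identifier here is illustrative, not MemHint's actual API.

```python
from dataclasses import dataclass

@dataclass
class Summary:
    # Shape of an ownership summary (illustrative; not MemHint's format).
    function: str
    kind: str   # "allocator" or "deallocator"
    owner: str  # which argument or return value carries ownership

def classify(functions):
    # Stand-in for the LLM labeling step: a trivial name heuristic
    # replaces the model call.
    out = []
    for name in functions:
        if "alloc" in name or name.endswith("_new"):
            out.append(Summary(name, "allocator", "return"))
        elif "free" in name or name.endswith("_destroy"):
            out.append(Summary(name, "deallocator", "arg0"))
    return out

def reachable(summary, cfg_functions):
    # Stand-in for the Z3 validation step: keep a summary only if its
    # claimed memory operation lies on some feasible path (toy oracle).
    return summary.function in cfg_functions

def build_summaries(functions, cfg_functions):
    # Surviving summaries would then be injected into CodeQL/Infer.
    return [s for s in classify(functions) if reachable(s, cfg_functions)]

summaries = build_summaries(
    ["buf_alloc", "buf_free", "parse_header", "ctx_new"],
    cfg_functions={"buf_alloc", "buf_free"},  # ctx_new: operation unreachable
)
print([s.function for s in summaries])  # ['buf_alloc', 'buf_free']
```

Note how `ctx_new` is classified as an allocator but filtered out by the reachability check — the division of labor the pipeline relies on.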
What carries the argument
Neuro-symbolic pipeline that uses LLM classification of custom memory functions to produce ownership summaries and Z3 to validate reachability and filter infeasible paths.
Load-bearing premise
The LLM correctly identifies which functions perform memory operations and what carries ownership, and Z3 accurately determines which paths are feasible.
What would settle it
Running MemHint on a codebase with many project-specific allocators that the LLM misclassifies: if the load-bearing premise is wrong, MemHint should report no more leaks there than unaugmented CodeQL or Infer.
read the original abstract
Memory leaks remain prevalent in real-world C/C++ software. Static analyzers such as CodeQL provide scalable program analysis but frequently miss such bugs because they cannot recognize project-specific custom memory-management functions and lack path-sensitive control-flow modeling. We present MemHint, a neuro-symbolic pipeline that addresses both limitations by combining LLMs' semantic understanding of code with Z3-based symbolic reasoning. MemHint parses the target codebase and applies an LLM to classify each function as a memory allocator, deallocator, or neither, producing function summaries that record which argument or return value carries memory ownership, extending the analyzer's built-in knowledge beyond standard primitives such as malloc and free. A Z3-based validation step checks each summary against the function's control-flow graph, discarding those whose claimed memory operation is unreachable on any feasible path. The validated summaries are injected into CodeQL and Infer via their respective extension mechanisms. Z3 path feasibility filtering then eliminates warnings on infeasible paths, and a final LLM-based validation step confirms whether each remaining warning is a genuine bug. On seven real-world C/C++ projects totaling over 3.4M lines of code, MemHint detects 52 unique memory leaks (49 confirmed/fixed, 4 CVEs submitted) at approximately $1.7 per detected bug, compared to 19 by vanilla CodeQL and 3 by vanilla Infer.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MemHint, a neuro-symbolic pipeline that uses an LLM to classify custom C/C++ memory-management functions (allocators/deallocators) and produce ownership summaries, validates them with Z3 reachability checks on the CFG, injects the summaries into CodeQL and Infer, applies Z3 path-feasibility filtering, and uses a final LLM step to confirm warnings. On seven real-world projects (>3.4 MLOC) it reports 52 unique leaks (49 confirmed/fixed, 4 CVEs) versus 19 for vanilla CodeQL and 3 for vanilla Infer, at roughly $1.7 per detected bug.
Significance. If the LLM classifications prove reliable, the work shows a practical, low-cost route to extending scalable static analyzers to project-specific APIs by combining semantic understanding with symbolic validation. The scale of the evaluation corpus, the bug counts, the independent confirmations, and the cost metric are concrete strengths that will interest the static-analysis and software-engineering communities.
major comments (3)
- [§3] §3 (LLM-based classification pipeline): No precision, recall, or sampled manual-audit numbers are reported for the LLM's labeling of custom allocators, deallocators, and ownership semantics on the 3.4 MLOC corpus. Because the generated summaries are injected directly into CodeQL and Infer, any systematic misclassification would inflate the reported 52 detections; the Z3 step only checks reachability of the claimed operation, not semantic correctness of the label.
- [§4] §4 (Experimental evaluation): The comparison with vanilla CodeQL and Infer lacks an ablation that removes either the LLM summaries or the Z3 filtering steps, so the individual contribution of each component to the jump from 19/3 to 52 leaks cannot be quantified. In addition, no false-negative analysis or sampling of missed leaks is provided.
- [§4] §4 (Bug confirmation): The claim that 49 of the 52 leaks were independently confirmed/fixed is load-bearing for the central improvement claim, yet the manuscript gives no protocol for the confirmation process, inter-annotator agreement, or how false positives were ruled out.
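The audit the first major comment asks for reduces to scoring LLM labels against manual ground truth. A minimal sketch — the sample data is invented, not from the paper:

```python
def audit_metrics(pairs):
    # Precision/recall of the LLM's "allocator" labels against a manual
    # audit; `pairs` holds (llm_label, true_label) for sampled functions.
    tp = sum(1 for p, t in pairs if p == t == "allocator")
    fp = sum(1 for p, t in pairs if p == "allocator" and t != "allocator")
    fn = sum(1 for p, t in pairs if p != "allocator" and t == "allocator")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

sample = [
    ("allocator", "allocator"),  # true positive
    ("allocator", "neither"),    # false positive: would inflate detections
    ("neither", "allocator"),    # false negative: missed custom allocator
    ("neither", "neither"),
]
print(audit_metrics(sample))  # (0.5, 0.5)
```

The same scoring would be repeated per label class (deallocator, ownership position) on the proposed 100-function sample.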
minor comments (3)
- [§4] The cost figure of approximately $1.7 per bug should be accompanied by an explicit breakdown (LLM API calls, Z3 queries, etc.) in the evaluation section.
- Clarify the exact extension mechanisms used to inject the validated summaries into CodeQL and Infer (e.g., which predicates or taint rules are overridden).
- A small number of typos and inconsistent capitalization appear in the abstract and section headings; a light copy-edit pass is recommended.
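On the injection question: CodeQL does accept YAML "models as data" extensions, so one plausible shape for the summary injection is generating such a fragment — but the pack and extensible predicate names below are placeholders, since the paper does not say which predicates MemHint actually overrides.

```python
def to_codeql_extension(summaries):
    # Render validated ownership summaries as a models-as-data style
    # YAML fragment. Names marked "placeholder" are assumptions.
    rows = "\n".join(
        f'      - ["{s["function"]}", "{s["owner"]}"]' for s in summaries
    )
    return (
        "extensions:\n"
        "  - addsTo:\n"
        "      pack: codeql/cpp-all         # placeholder pack\n"
        "      extensible: allocationModel  # placeholder predicate\n"
        "    data:\n" + rows + "\n"
    )

yaml_text = to_codeql_extension([{"function": "buf_alloc", "owner": "return"}])
print(yaml_text)
```

A revision spelling out the real predicate names (and the analogous Infer model mechanism) would make the comment actionable.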
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the planned revisions.
read point-by-point responses
Referee: [§3] §3 (LLM-based classification pipeline): No precision, recall, or sampled manual-audit numbers are reported for the LLM's labeling of custom allocators, deallocators, and ownership semantics on the 3.4 MLOC corpus. Because the generated summaries are injected directly into CodeQL and Infer, any systematic misclassification would inflate the reported 52 detections; the Z3 step only checks reachability of the claimed operation, not semantic correctness of the label.
Authors: We agree that quantitative metrics for the LLM classification step are missing and would strengthen the evaluation. In the revision we will add a manual audit of a random sample of 100 functions drawn from the corpus, reporting precision, recall, and F1 scores separately for allocator/deallocator identification and for ownership-summary accuracy. While the Z3 reachability check discards summaries whose claimed operations are unreachable, we acknowledge it does not verify semantic correctness; the added audit will quantify any misclassification rate. The high rate of independently confirmed bugs (49/52) supplies supporting end-to-end evidence, but we will make the classification reliability explicit. revision: yes
Referee: [§4] §4 (Experimental evaluation): The comparison with vanilla CodeQL and Infer lacks an ablation that removes either the LLM summaries or the Z3 filtering steps, so the individual contribution of each component to the jump from 19/3 to 52 leaks cannot be quantified. In addition, no false-negative analysis or sampling of missed leaks is provided.
Authors: We concur that an ablation study is needed to isolate component contributions. We will add results in §4 for three configurations on the same seven projects: (i) vanilla CodeQL/Infer, (ii) augmented with LLM summaries only, and (iii) the full pipeline with both LLM summaries and Z3 filtering. This will quantify the incremental gains. For false negatives we will sample 10% of functions in the largest project, manually inspect them for missed leaks, and report an estimated recall; exhaustive ground truth is unavailable, but sampling provides a practical bound. revision: yes
Referee: [§4] §4 (Bug confirmation): The claim that 49 of the 52 leaks were independently confirmed/fixed is load-bearing for the central improvement claim, yet the manuscript gives no protocol for the confirmation process, inter-annotator agreement, or how false positives were ruled out.
Authors: We will expand §4 with a precise confirmation protocol: each of the 52 warnings was reviewed independently by two authors; disagreements were resolved by joint discussion until consensus. We will report the resulting inter-annotator agreement percentage. False positives were ruled out by (a) confirming via Z3 that no deallocation occurs on any feasible path and (b) manual inspection of the call graph and ownership flow. Where possible, confirmation was further supported by submitted patches or CVEs. These details will be added to the revised manuscript. revision: yes
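The agreement percentage promised here is conventionally reported as Cohen's kappa, which corrects raw agreement for chance. A self-contained sketch with invented annotator labels:

```python
from collections import Counter

def cohens_kappa(a, b):
    # Chance-corrected agreement between two annotators' labels.
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n              # observed
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in ca.keys() | cb.keys()) / (n * n)  # chance
    return (po - pe) / (1 - pe)

r1 = ["bug", "bug", "bug", "ok", "ok", "bug"]  # annotator 1 (invented)
r2 = ["bug", "bug", "ok", "ok", "ok", "bug"]   # annotator 2 (invented)
print(round(cohens_kappa(r1, r2), 3))  # 0.667
```

Reporting kappa alongside the raw agreement percentage would address the referee's concern directly.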
Circularity Check
No significant circularity; empirical results on external projects with independent validation
full rationale
The paper describes a neuro-symbolic pipeline evaluated on seven external real-world C/C++ codebases (3.4M LOC total). Claims rest on direct comparisons to unmodified CodeQL and Infer baselines plus manual confirmation of detected bugs (49/52 fixed, CVEs submitted). No mathematical derivation chain, no fitted parameters renamed as predictions, no self-citations invoked as load-bearing uniqueness theorems, and no ansatz or renaming of known results. The LLM classification and Z3 filtering steps are tool components whose correctness is assessed via external outcomes rather than by construction from the same inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Z3 solver accurately determines path feasibility in the control-flow graph
- domain assumption: LLM can reliably classify functions as allocators, deallocators, or neither based on code semantics
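The first axiom's job can be illustrated without an SMT solver: a path is feasible iff some assignment satisfies every branch condition collected along it. The brute-force search below is a stand-in for the Z3 query (MemHint uses Z3 proper; this toy enumeration over a tiny integer domain only shows the filtering idea).

```python
from itertools import product

def feasible(constraints, names, domain=range(-2, 3)):
    # Enumerate assignments over a small domain in place of a
    # symbolic SMT query; Z3 decides the same question exactly.
    for values in product(domain, repeat=len(names)):
        env = dict(zip(names, values))
        if all(c(env) for c in constraints):
            return True
    return False

# A warning on the path `if (n > 0) { ... if (n < 0) leak(); }` is
# dropped because its accumulated conditions contradict each other:
contradictory = [lambda e: e["n"] > 0, lambda e: e["n"] < 0]
satisfiable   = [lambda e: e["n"] > 0, lambda e: e["n"] < 3]
print(feasible(contradictory, ["n"]))  # False
print(feasible(satisfiable, ["n"]))    # True
```

The second axiom has no comparable mechanical check, which is exactly why the referee asks for a manual audit of the LLM labels.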