arxiv: 2605.22087 · v1 · pith:TO4QLYMNnew · submitted 2026-05-21 · 💻 cs.SE · cs.CR

Automated Repair of TEE Partitioning Issues via DSL-Guided and LLM-Assisted Patching

Pith reviewed 2026-05-22 04:48 UTC · model grok-4.3

classification 💻 cs.SE cs.CR
keywords trusted execution environments · partitioning errors · automated repair · domain-specific language · large language models · security patches · code repair · vulnerability mitigation

The pith

TEERepair uses a domain-specific language and LLMs to automatically repair partitioning errors in trusted execution environments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TEERepair to fix improper partitioning in TEE applications that can cause data leakage or code injection when code interacts with the untrusted OS. It defines a domain-specific language to capture common security patterns as reusable patch templates with placeholders. Large language models then fill those templates with context from the code and produce tests to check the fixes. Evaluation on the PartitioningE-Bench shows an 87.6 percent repair success rate, higher than baselines, and the system generated five pull requests for real projects with two merged. This matters because TEE code is security-critical yet difficult to partition correctly by hand.

Core claim

We present TEERepair, a framework for automatically repairing bad partitioning issues in TEE applications. Our approach tackles the challenges by introducing a domain-specific language to encode repair rules that express and capture common TEE security patterns, which are instantiated as patch templates with placeholders for context-specific variables. We then leverage large language models to reason about code semantics and synthesize context-aware patches, and further generate test clients to validate the repairs. We evaluate TEERepair on the TEE Partitioning Errors Benchmark, achieving a significantly higher repair success rate of 87.6% compared to baselines. Furthermore, applying TEERepair to real-world TEE projects, we submitted 5 repair pull requests, 2 of which have been confirmed and merged by project maintainers.

What carries the argument

A domain-specific language that encodes TEE security patterns as patch templates with placeholders, which guides LLMs to produce context-specific repairs and validation tests.
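
As a rough illustration of how a placeholder-based patch template might be instantiated, the sketch below fills a guard template of the kind that Rule 2 (input validation repaired by inserting a guard) suggests. Everything in it is an assumption made for illustration: the template text, the placeholder names (`param`, `max_len`, `error_code`), and the `instantiate` helper are hypothetical and do not reproduce the paper's DSL grammar. In TEERepair the bindings would be proposed by an LLM, not written by hand as they are here.

```python
from string import Template

# Hypothetical patch template: a bounds-check guard inserted before a buffer is used.
# The placeholder names are illustrative, not the paper's DSL syntax.
GUARD_TEMPLATE = Template(
    "if ($param == NULL || ${param}_len > $max_len) {\n"
    "    return $error_code;\n"
    "}\n"
)

def instantiate(template: Template, bindings: dict) -> str:
    """Fill every placeholder; raise KeyError if a binding is missing."""
    return template.substitute(bindings)

# Bindings an LLM might propose after reading the surrounding C code (assumed values).
bindings = {
    "param": "in_buf",
    "max_len": "MAX_INPUT_SIZE",
    "error_code": "TEE_ERROR_BAD_PARAMETERS",
}

patch = instantiate(GUARD_TEMPLATE, bindings)
print(patch)
```

Using `substitute` instead of `safe_substitute` means an unfilled placeholder fails loudly and cannot leak into the emitted patch.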

If this is right

  • Partitioning errors in TEE code can be repaired automatically at high rates instead of requiring manual fixes.
  • DSL-defined rules combined with LLMs produce patches that maintain security properties in low-level C code.
  • Generated test clients can validate repairs without needing extensive separate test suites.
  • Real-world TEE projects receive usable patch suggestions that maintainers accept and merge.
  • A dedicated benchmark enables direct comparison of future repair tools for this class of errors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same DSL-plus-LLM structure could guide repairs for other low-level security issues such as memory isolation mistakes in embedded systems.
  • Combining TEERepair with existing static analysis detectors would create an end-to-end find-and-fix pipeline for TEE applications.
  • Success here would suggest that structured domain rules can make LLMs more reliable for security-sensitive code changes where pure prompting often fails.
  • Extending the DSL to newer TEE hardware features could test whether the approach scales as isolation mechanisms evolve.

Load-bearing premise

The domain-specific language encodes the majority of common TEE security patterns so LLMs can generate correct patches that do not introduce new vulnerabilities.

What would settle it

A collection of new TEE partitioning errors outside the current benchmark where TEERepair's success rate falls below 50 percent or where generated patches fail validation by creating leaks or injections.
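
The criterion above can be written as a small check. This is a sketch under assumptions: the `RepairOutcome` record and its two flags are hypothetical, and the 50 percent threshold is the one stated in the paragraph, not a figure taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class RepairOutcome:
    repaired: bool             # patch removed the original partitioning issue
    introduced_new_flaw: bool  # patch created a new leak or injection path

def claim_is_challenged(outcomes: list[RepairOutcome], threshold: float = 0.5) -> bool:
    """Return True if the success rate drops below the threshold
    or any patch introduces a new leak or injection."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    # A repair only counts as a success if it also introduces no new flaw.
    successes = sum(o.repaired and not o.introduced_new_flaw for o in outcomes)
    rate = successes / len(outcomes)
    any_new_flaw = any(o.introduced_new_flaw for o in outcomes)
    return rate < threshold or any_new_flaw
```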

Figures

Figures reproduced from arXiv: 2605.22087 by Chengyan Ma, David Lo, Feng Li, Jieke Shi, Ruidong Han, Ye Liu, Yuqing Niu.

Figure 1: System architecture of TEE.
Figure 3: Overview workflow of TEERepair. The 6 steps are:
Figure 4: Domain-Specific Language Grammar.
Figure 5: Rule 1: Unencrypted data output repaired by adding encryption.
Figure 6: Rule 2: Input validation weaknesses repaired by inserting a guard.
Figure 7: Rule 3: Direct usage of shared memory repaired by deep copy and integrity check.
Figure 8: Prompt for generating the repair patch.
… been modified. The template in Fig. 7b performs deep copy (line 3) to replace shallow copies (e.g., line 1), with integrity checks in lines 4 to 10. For operations modifying shared memory (line 1 of Fig. 7d), Rule 3.2 ensures changes are first performed on the deep copy buf, then copied back after updating the secure object. As shown in Fig. 7d, operations on params[…
Figure 9: Workflow of generating the test client for a TEE application.
Figure 10: Undetected issues of unencrypted data out
Figure 12: Overfitting repair on the issues of unencrypted data output by ChatRepair.
Original abstract

Trusted Execution Environments (TEEs) provide hardware-based isolation to protect sensitive data and computations from potentially compromised operating systems (OS). However, TEE applications inevitably interact with the untrusted OS through SDK interfaces, and improper partitioning can introduce severe vulnerabilities such as data leakage and code injection. While prior work has proposed static analysis tools to detect such issues, automated repair remains largely unexplored. This problem is particularly challenging due to three TEE-specific factors: the lack of standardized secure development guidelines, the difficulty of extracting semantic information from low-level C code, and the absence of mature testing and validation methods. In this work, we present TEERepair, a framework for automatically repairing bad partitioning issues in TEE applications. Our approach tackles the above challenges by introducing a domain-specific language (DSL) to encode repair rules that express and capture common TEE security patterns, which are instantiated as patch templates with placeholders for context-specific variables. We then leverage large language models (LLMs) to reason about code semantics and synthesize context-aware patches, and further generate test clients to validate the repairs. We evaluate TEERepair on the TEE Partitioning Errors Benchmark (PartitioningE-Bench), achieving a significantly higher repair success rate of 87.6% compared to baselines. Furthermore, applying TEERepair to real-world TEE projects, we submitted 5 repair pull requests, 2 of which have been confirmed and merged by project maintainers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents TEERepair, a framework for automated repair of TEE partitioning issues in C-based applications. It introduces a DSL to encode common TEE security patterns as reusable patch templates, uses LLMs to instantiate context-specific patches, and generates test clients for validation. The approach is evaluated on the newly introduced PartitioningE-Bench benchmark, where it achieves an 87.6% repair success rate, outperforming baselines, and is applied to real-world TEE projects, yielding five pull requests, of which two were merged by maintainers.

Significance. If the empirical results hold under more rigorous validation, the work would make a meaningful contribution to automated security repair in the TEE domain by bridging static analysis detection with practical patching. Strengths include the creation of PartitioningE-Bench as a reusable artifact and the demonstration of real-world impact through merged pull requests, which provide external validation independent of the authors' metrics. The DSL-guided plus LLM synthesis combination addresses a genuine gap where prior tools stop at detection.

major comments (2)
  1. [§5] §5 (Evaluation) and abstract: the central claim of an 87.6% success rate and superiority over baselines is presented without definitions of the baseline systems, implementation details of the comparison, statistical significance tests, or false-positive rates for accepted patches. This information is load-bearing for interpreting whether the result demonstrates a genuine advance rather than an artifact of evaluation setup.
  2. [§4.3] §4.3 (Test Client Generation) and §5.2: patch correctness is asserted on the basis of LLM-generated test clients, yet the manuscript provides no coverage metrics, adversarial test suites targeting TEE-specific regressions (data leakage, isolation violations, injection), or manual audit of the accepted patches. Because the success rate and merged PRs rest on these tests being sufficient to expose new vulnerabilities, the absence of such evidence directly undermines the headline empirical result.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'significantly higher repair success rate' should be accompanied by the exact baseline rates and the statistical test used, even in the abstract.
  2. [§3] Notation: the DSL syntax and placeholder semantics are introduced without a compact grammar or example derivation; a small table or figure showing one full rule-to-patch expansion would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and describe the revisions we will make to strengthen the evaluation and validation sections.

point-by-point responses (2)
  1. Referee: [§5] §5 (Evaluation) and abstract: the central claim of an 87.6% success rate and superiority over baselines is presented without definitions of the baseline systems, implementation details of the comparison, statistical significance tests, or false-positive rates for accepted patches. This information is load-bearing for interpreting whether the result demonstrates a genuine advance rather than an artifact of evaluation setup.

    Authors: We agree that the current presentation of baselines and comparison details is insufficient for full reproducibility and interpretation. In the revised manuscript we will add a new subsection in §5 that (1) explicitly defines each baseline system and its adaptation to TEE partitioning repair, (2) provides implementation details and hyperparameters used in the comparison, (3) reports statistical significance tests (McNemar’s test for paired success rates and Wilcoxon signed-rank test across benchmark instances), and (4) includes a false-positive analysis based on manual inspection of a random sample of accepted patches. These additions will make the 87.6% claim and its superiority over baselines fully substantiated. revision: yes

  2. Referee: [§4.3] §4.3 (Test Client Generation) and §5.2: patch correctness is asserted on the basis of LLM-generated test clients, yet the manuscript provides no coverage metrics, adversarial test suites targeting TEE-specific regressions (data leakage, isolation violations, injection), or manual audit of the accepted patches. Because the success rate and merged PRs rest on these tests being sufficient to expose new vulnerabilities, the absence of such evidence directly undermines the headline empirical result.

    Authors: We acknowledge that additional evidence is required to demonstrate that the LLM-generated test clients are adequate for validating TEE-specific repairs. In the revision we will (1) report statement and branch coverage metrics obtained with gcov on the generated clients, (2) describe and evaluate an adversarial test suite we constructed that specifically targets data leakage, isolation violations, and injection attacks, and (3) present results of a manual audit performed on a 25% random sample of accepted patches. While the two merged pull requests already supply external confirmation, these internal metrics will directly address the concern about test sufficiency. revision: yes
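
For the paired comparison promised in the first response, an exact McNemar test needs only the two discordant counts. The sketch below is illustrative only: the function name and the example counts are made up, and the paper's per-instance outcomes are not available on this page.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: instances the tool repaired but the baseline did not
    c: instances the baseline repaired but the tool did not
    Under the null hypothesis the discordant pairs split 50/50,
    so the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up counts purely for illustration.
print(mcnemar_exact(b=10, c=2))
```

Only instances where exactly one of the two systems succeeds enter the test; instances both systems repair, or both miss, carry no information about which is better.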

Circularity Check

0 steps flagged

No significant circularity in TEERepair evaluation chain

The paper introduces TEERepair as an empirical framework combining a DSL for TEE security patterns with LLM-based patch synthesis and test client generation. Its primary claims rest on a measured 87.6% repair success rate against baselines on the newly created PartitioningE-Bench plus external validation via five real-world pull requests (two merged by maintainers). These outcomes rely on independent benchmarks and third-party acceptance rather than any self-definitional reduction, fitted parameter renamed as prediction, or load-bearing self-citation. No equations, uniqueness theorems, or ansatzes appear in the derivation; success is defined externally to the method's internal templates and LLM calls. The evaluation therefore remains self-contained against external criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on the unproven assumption that a compact DSL can capture the relevant security patterns and that current LLMs possess sufficient semantic understanding of low-level C to produce safe patches; no free parameters are described.

axioms (2)
  • domain assumption: LLMs can reliably extract semantic information from low-level C code for the purpose of patch synthesis.
    Invoked when the paper states that LLMs are used to reason about code semantics and synthesize context-aware patches.
  • ad hoc to paper: Common TEE partitioning security patterns can be expressed as reusable DSL rules that generalize across applications.
    Central to the DSL-guided component described in the abstract.
invented entities (1)
  • DSL for TEE repair rules and patch templates (no independent evidence)
    purpose: To encode common security patterns as instantiable templates with placeholders.
    Newly introduced construct that guides the patching process.

pith-pipeline@v0.9.0 · 5811 in / 1556 out tokens · 41312 ms · 2026-05-22T04:48:51.784541+00:00 · methodology

Reference graph

Works this paper leans on

85 extracted references · 85 canonical work pages

  1. [1]

    Sharmin Afrose, Sazzadur Rahaman, and Danfeng Yao. 2019. CryptoAPI-Bench: A Comprehensive Benchmark on Java Cryptographic API Misuses. In2019 IEEE Cybersecurity Development (SecDev). 49–61

  2. [2]

    Zaheer Ahmad, Lishoy Francis, Tansir Ahmed, Christopher Lobodzinski, Dev Audsin, and Peng Jiang. 2013. Enhancing the Security of Mobile Applications by Using TEE and (U)SIM. In2013 IEEE 10th International Conference on Ubiquitous Intelligence and Computing and 2013 IEEE 10th International Conference on Autonomic and Trusted Computing. 575–582

  3. [3]

    Islem Bouzenia, Premkumar Devanbu, and Michael Pradel. 2025. RepairAgent: An Autonomous, LLM-Based Agent for Program Repair. In2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE). 2188–2200

  4. [4]

    Davide Bove. 2024. A Large-Scale Study on the Prevalence and Usage of TEE-based Features on Android. InProceedings of the 19th International Conference on A vailability, Reliability and Security(Vienna, Austria)(ARES ’24). Association for Computing Machinery, New York, NY, USA, Article 29, 11 pages

  5. [5]

    Marcel Busch, Aravind Machiry, Chad Spensky, Giovanni Vigna, Christopher Kruegel, and Mathias Payer. 2023. TEEzz: Fuzzing Trusted Applications on COTS Android Devices. In2023 IEEE Symposium on Security and Privacy (SP). 1204–1219

  6. [6]

    Liheng Chen, Zheming Li, Zheyu Ma, Yuan Li, Baojian Chen, and Chao Zhang. 2024. EnclaveFuzz: Finding Vulnerabili- ties in SGX Applications. In31st Annual Network and Distributed System Security Symposium, NDSS 2024, San Diego, California, USA, February 26 - March 1, 2024. The Internet Society

  7. [7]

    Zimin Chen, Steve Kommrusch, Michele Tufano, Louis-Noël Pouchet, Denys Poshyvanyk, and Martin Monperrus

  8. [8]

    SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair.IEEE Transactions on Software Engineering47, 9 (2021), 1943–1959

  9. [9]

    Intel Corporation. 2020. Intel(R) Software Guard Extensions Developer Guide. https://community.intel.com/legacyfs/ online/drupal_files/managed/33/70/intel-sgx-developer-guide.pdf

  10. [10]

    Guoyun Duan, Yuanzhi Fu, Boyang Zhang, Peiyao Deng, Jianhua Sun, Hao Chen, and Zhiwen Chen. 2023. TEEFuzzer: A fuzzing framework for trusted execution environments with heuristic seed mutation.Future Gener. Comput. Syst. 144, C (July 2023), 192–204

  11. [11]

    Thomas Durieux and Martin Monperrus. 2016. DynaMoth: dynamic code synthesis for automatic program repair. In Proceedings of the 11th International Workshop on Automation of Software Test(Austin, Texas)(AST ’16). Association for Computing Machinery, New York, NY, USA, 85–91

  12. [12]

    Jan-Erik Ekberg. 2016. Hardware Isolation for Trusted Execution. InProceedings of the 6th Workshop on Security and Privacy in Smartphones and Mobile Devices(Vienna, Austria)(SPSM ’16). Association for Computing Machinery, New York, NY, USA, 1

  13. [13]

    Yongkai Fan, Guanqun Zhao, Kuan-Ching Li, Bin Zhang, Gang Tan, Xiaofeng Sun, and Fanglue Xia. 2020. SNPL: One Scheme of Securing Nodes in IoT Perception Layer.Sensors20, 4 (2020)

  14. [14]

    Fabian Fleischer, Marcel Busch, and Phillip Kuhrt. 2020. Memory corruption attacks within Android TEEs: a case study based on OP-TEE. InProceedings of the 15th International Conference on A vailability, Reliability and Security(Virtual Event, Ireland)(ARES ’20). Association for Computing Machinery, New York, NY, USA, Article 53, 9 pages

  15. [15]

    Michael Fu, Chakkrit Tantithamthavorn, Trung Le, Van Nguyen, and Dinh Phung. 2022. VulRepair: a T5-based automated software vulnerability repair. InProceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering(Singapore, Singapore)(ESEC/FSE 2022). Association for Computing Machinery, Ne...

  16. [16]

    Duck, Ruyi Ji, Yingfei Xiong, and Abhik Roychoudhury

    Xiang Gao, Bo Wang, Gregory J. Duck, Ruyi Ji, Yingfei Xiong, and Abhik Roychoudhury. 2021. Beyond Tests: Program Vulnerability Repair via Crash Constraint Extraction.ACM Trans. Softw. Eng. Methodol.30, 2, Article 14 (Feb. 2021), 27 pages

  17. [17]

    GitHub. 2025. CodeQL — codeql.github.com. https://codeql.github.com

  18. [18]

    GlobalPlatform, Inc. 2018. TEE Internal Core API Specification Version 1.1.2.50. https://globalplatform.org/wp- content/uploads/2018/06/GPD_TEE_Internal_Core_API_Specification_v1.1.2.50_PublicReview.pdf

  19. [19]

    Yiwei Hu, Zhen Li, Kedie Shu, Shenghua Guan, Deqing Zou, Shouhuai Xu, Bin Yuan, and Hai Jin. 2025. SoK: Automated Vulnerability Repair: Methods, Tools, and Assessments. In34th USENIX Security Symposium, USENIX Security 2025, Seattle, W A, USA, August 13-15, 2025, Lujo Bauer and Giancarlo Pellegrino (Eds.). USENIX Association, 4421–4440

  20. [20]

    Kai Huang, Zhengzi Xu, Su Yang, Hongyu Sun, Xuejun Li, Zheng Yan, and Yuqing Zhang. 2024. Evolving Paradigms in Automated Program Repair: Taxonomy, Challenges, and Opportunities.ACM Comput. Surv.57, 2, Article 36 (Oct. 2024), 43 pages

  21. [21]

    Huawei Technologies Co., Ltd. 2019. Huawei iTrustee V3.0 on Kirin 980 Security Target. https://messervices.cyber. gouv.fr/visas/ANSSI-CC-2020-67-cible.pdf

  22. [22]

    Jiajun Jiang, Yingfei Xiong, Hongyu Zhang, Qing Gao, and Xiangqun Chen. 2018. Shaping program repair space with existing patches and similar code. InProceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis(Amsterdam, Netherlands)(ISSTA 2018). Association for Computing Machinery, New York, NY, USA, Proc. ACM Softw. Eng.,...

  23. [23]

    Nan Jiang, Thibaud Lutellier, and Lin Tan. 2021. CURE: Code-Aware Neural Machine Translation for Automatic Program Repair. In2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). 1161–1173

  24. [24]

    Mustakimur Rahman Khandaker, Yueqiang Cheng, Zhi Wang, and Tao Wei. 2020. COIN Attacks: On Insecurity of Enclave Untrusted Interfaces in SGX. InProceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems(Lausanne, Switzerland)(ASPLOS ’20). Association for Computing Machinery, New York, ...

  25. [25]

    Dongsun Kim, Jaechang Nam, Jaewoo Song, and Sunghun Kim. 2013. Automatic patch generation learned from human-written patches. In2013 35th International Conference on Software Engineering (ICSE). 802–811

  26. [26]

    Youngjoon Kim, Sunguk Shin, Hyoungshick Kim, and Jiwon Yoon. 2025. Logs In, Patches Out: Automated Vulnerability Repair via Tree-of-Thought LLM Analysis. In34th USENIX Security Symposium, USENIX Security 2025, Seattle, W A, USA, August 13-15, 2025, Lujo Bauer and Giancarlo Pellegrino (Eds.). USENIX Association, 4401–4419

  27. [27]

    Jiaolong Kong, Xiaofei Xie, Mingfei Cheng, Shangqing Liu, Xiaoning Du, and Qi Guo. 2025. ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs.ACM Trans. Softw. Eng. Methodol.34, 8, Article 216 (Oct. 2025), 31 pages

  28. [28]

    Bissyandé, Dongsun Kim, Jacques Klein, Martin Monperrus, and Yves Le Traon

    Anil Koyuncu, Kui Liu, Tegawendé F. Bissyandé, Dongsun Kim, Jacques Klein, Martin Monperrus, and Yves Le Traon

  29. [29]

    Engg.25, 3 (May 2020), 1980–2024

    FixMiner: Mining relevant fix patterns for automated program repair.Empirical Softw. Engg.25, 3 (May 2020), 1980–2024

  30. [30]

    Shubham Kumar Lala, Akshat Kumar, and Subbulakshmi T. 2021. Secure Web development using OWASP Guidelines. In2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS). 323–332

  31. [31]

    Titouan Lazard, Johannes Götzfried, Tilo Müller, Gianni Santinelli, and Vincent Lefebvre. 2018. TEEshift: Protecting Code Confidentiality by Selectively Shifting Functions into TEEs. InProceedings of the 3rd Workshop on System Software for Trusted Execution(Toronto, Canada)(SysTEX ’18). Association for Computing Machinery, New York, NY, USA, 14–19

  32. [32]

    Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2012. GenProg: A Generic Method for Automatic Software Repair.IEEE Transactions on Software Engineering38, 1 (2012), 54–72

  33. [33]

    Ding Li, Ziqi Zhang, Mengyu Yao, Yifeng Cai, Yao Guo, and Xiangqun Chen. 2025. TEESlice: Protecting Sensitive Neural Network Models in Trusted Execution Environments when Attackers Have Pre-Trained Models.ACM Trans. Softw. Eng. Methodol.34, 6, Article 166 (July 2025), 49 pages

  34. [34]

    Wenhao Li, Yubin Xia, and Haibo Chen. 2019. Research on ARM TrustZone.GetMobile: Mobile Comp. and Comm.22, 3 (Jan. 2019), 17–22

  35. [35]

    Yi Li, Shaohua Wang, and Tien N. Nguyen. 2020. DLFix: context-based code transformation learning for automated program repair. InProceedings of the ACM/IEEE 42nd International Conference on Software Engineering(Seoul, South Korea)(ICSE ’20). Association for Computing Machinery, New York, NY, USA, 602–614

  36. [36]

    Yi Li, Shaohua Wang, and Tien N. Nguyen. 2022. DEAR: a novel deep learning-based approach for automated program repair. InProceedings of the 44th International Conference on Software Engineering(Pittsburgh, Pennsylvania)(ICSE ’22). Association for Computing Machinery, New York, NY, USA, 511–523

  37. [37]

    Bissyandé

    Kui Liu, Anil Koyuncu, Dongsun Kim, and Tegawendé F. Bissyandé. 2019. TBar: revisiting template-based automated program repair. InProceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (Beijing, China)(ISSTA 2019). Association for Computing Machinery, New York, NY, USA, 31–42

  38. [38]

    Bissyandè

    Kui Liu, Anil Koyuncu, Dongsun Kim, and Tegawende F. Bissyandè. 2019. AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations. In2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). 1–12

  39. [39]

    Fan Long and Martin Rinard. 2016. Automatic patch generation by learning correct code. InProceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages(St. Petersburg, FL, USA)(POPL ’16). Association for Computing Machinery, New York, NY, USA, 298–312

  40. [40]

    Di Lu, Minqiang Shi, Xindi Ma, Ximeng Liu, Rui Guo, Tianfang Zheng, Yulong Shen, Xuewen Dong, and Jianfeng Ma

  41. [41]

    Smaug: A TEE-Assisted Secured SQLite for Embedded Systems.IEEE Transactions on Dependable and Secure Computing20, 5 (2023), 3617–3635

  42. [42]

    Thibaud Lutellier, Hung Viet Pham, Lawrence Pang, Yitong Li, Moshi Wei, and Lin Tan. 2020. CoCoNuT: combining context-aware neural translation models using ensemble for program repair. InProceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis(Virtual Event, USA)(ISSTA 2020). Association for Computing Machinery, New Yor...

  43. [43]

    Chengyan Ma, Ruidong Han, Jieke Shi, Ye Liu, Yuqing Niu, Di Lu, Chuang Tian, Jianfeng Ma, Debin Gao, and David Lo

  44. [44]

    arXiv:2502.15281 [cs.CR]

    DITING: A Static Analyzer for Identifying Bad Partitioning Issues in TEE Applications. arXiv:2502.15281 [cs.CR]

  45. [45]

    Chengyan Ma, Di Lu, Chaoyue Lv, Ning Xi, Xiaohong Jiang, Yulong Shen, and Jianfeng Ma. 2024. BiTDB: Constructing A Built-in TEE Secure Database for Embedded Systems.IEEE Transactions on Knowledge and Data Engineering36, 9 Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE200. Publication date: July 2026. Automated Repair of TEE Partitioning Issues via DS...

  46. [46]

    Pieter Maene, Johannes Götzfried, Ruan de Clercq, Tilo Müller, Felix Freiling, and Ingrid Verbauwhede. 2018. Hardware- Based Trusted Computing Architectures for Isolation and Attestation.IEEE Trans. Comput.67, 3 (2018), 361–374

  47. [47]

    Matias Martinez and Martin Monperrus. 2016. ASTOR: a program repair library for Java (demo). InProceedings of the 25th International Symposium on Software Testing and Analysis(Saarbrücken, Germany)(ISSTA 2016). Association for Computing Machinery, New York, NY, USA, 441–444

  48. [48]

    Matias Martinez and Martin Monperrus. 2018. Ultra-Large Repair Search Space with Automatically Mined Templates: The Cardumen Mode of Astor. InSearch-Based Software Engineering - 10th International Symposium, SSBSE 2018, Montpellier, France, September 8-9, 2018, Proceedings (Lecture Notes in Computer Science, Vol. 11036), Thelma Elita Colanzi and Phil McMi...

  49. [49]

    Nuthan Munaiah, Steven Kroh, Craig Cabrey, and Meiyappan Nagappan. 2017. Curating GitHub for engineered software projects.Empirical Softw. Engg.22, 6 (Dec. 2017), 3219–3253

  50. [50]

    Tsunato Nakai, Daisuke Suzuki, and Takeshi Fujino. 2021. Towards Trained Model Confidentiality and Integrity Using Trusted Execution Environments. InApplied Cryptography and Network Security Workshops: ACNS 2021 Satellite Workshops, AIBlock, AIHWS, AIoTS, CIMSS, Cloud S&P, SCI, SecMT, and SiMLA, Kamakura, Japan, June 21–24, 2021, Proceedings(Kamakura, Jap...

  51. [51]

    Peng Ning. 2014. Samsung KNOX and Enterprise Mobile Security. InProceedings of the 4th ACM Workshop on Security and Privacy in Smartphones & Mobile Devices(Scottsdale, Arizona, USA)(SPSM ’14). Association for Computing Machinery, New York, NY, USA, 1

  52. [52]

    Yuqing Niu, Jieke Shi, Ruidong Han, Ye Liu, Chengyan Ma, Yunbo Lyu, and David Lo. 2026. What You Trust Is Insecure: Demystifying How Developers (Mis)Use Trusted Execution Environments in Practice. arXiv:2512.17363 [cs.SE]

  53. [53]

    Minkyung Park, Jeongnyeo Kim, Youngho Kim, Eunsang Cho, Soobin Park, Sungmin Sohn, Minhyeok Kang, and Ted Taekyoung Kwon. 2020. An SGX-Based Key Management Framework for Data Centric Networking.IEEE Access8 (2020), 45198–45210

  54. [54]

    Abhik Roychoudhury. 2025. Agentic AI for Software: thoughts from Software Engineering community. arXiv:2508.17343 [cs.SE]

  55. [55]

    Haifeng Ruan, Yuntong Zhang, and Abhik Roychoudhury. 2025. SpecRover: Code Intent Extraction via LLMs. In2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE). 963–974

  56. [56]

    Gianluca Scopelliti, Christoph Baumann, and Jan Tobias Mühlberg. 2024. Understanding Trust Relationships in Cloud- Based Confidential Computing. In2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). 169–176

  57. [57]

    Ridwan Shariffdeen, Yannic Noller, Lars Grunske, and Abhik Roychoudhury. 2021. Concolic program repair. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (Virtual, Canada)(PLDI 2021). Association for Computing Machinery, New York, NY, USA, 390–405

  58. [58]

    Timperley, Yannic Noller, Claire Le Goues, and Abhik Roychoudhury

    Ridwan Shariffdeen, Christopher S. Timperley, Yannic Noller, Claire Le Goues, and Abhik Roychoudhury. 2025. Vulnerability Repair via Concolic Execution and Code Mutations.ACM Trans. Softw. Eng. Methodol.34, 4, Article 105 (April 2025), 27 pages

  59. [59]

    Raksha Sharma. 2024. Trusted Execution Environment Market Research Report 2033. https://growthmarketreports. com/report/trusted-execution-environment-market

  60. [60]

    Alex Shaw, Dusten Doggett, and Munawar Hafiz. 2014. Automatically Fixing C Buffer Overflows Using Program Transformations. In2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. 124–135

  61. [61]

    Dongwook Shim and Dong Hoon Lee. 2021. SOTPM: Software One-Time Programmable Memory to Protect Shared Memory on ARM Trustzone.IEEE Access9 (2021), 4490–4504

  62. [62]

    The KLEE Team. 2026. KLEE — klee-se.org. https://klee-se.org/

  63. [63]

    TrustedFirmware.org. 2023. About OP-TEE. https://optee.readthedocs.io/en/latest/general/about.html

  64. [64]

    Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and Denys Poshyvanyk. 2019. An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation.ACM Trans. Softw. Eng. Methodol.28, 4, Article 19 (Sept. 2019), 29 pages

  65. [65]

    Mark Vella, Christian Colombo, Robert Abela, and Peter Spacek. 2021. RV-TEE: secure cryptographic protocol execution based on runtime verification.J. Comput. Virol. Hacking Tech.17, 3 (2021), 229–248

  66. [66]

    Yuxiang Wei, Chunqiu Steven Xia, and Lingming Zhang. 2023. Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair. InProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering(San Francisco, CA, USA)(ESEC/FSE 2023). Association for ...

  67. [67]

    Chunqiu Steven Xia, Yifeng Ding, and Lingming Zhang. 2023. The Plastic Surgery Hypothesis in the Era of Large Language Models. In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). 522–534. Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE200. Publication date: July 2026. FSE200:24 Chengyan Ma, Jieke Shi, Ruidong Han, Ye...

  68. [68]

    Chunqiu Steven Xia, Yuxiang Wei, and Lingming Zhang. 2023. Automated Program Repair in the Era of Large Pre-trained Language Models. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). 1482–1494

  69. [69]

    Chunqiu Steven Xia and Lingming Zhang. 2022. Less training, more repairing please: revisiting automated program repair via zero-shot learning. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Singapore, Singapore) (ESEC/FSE 2022). Association for Computing Machinery, New Y...

  70. [70]

    Chunqiu Steven Xia and Lingming Zhang. 2023. Conversational Automated Program Repair. arXiv:2301.13246 [cs.SE]

  71. [71]

    Chunqiu Steven Xia and Lingming Zhang. 2024. Automated Program Repair via Conversation: Fixing 162 out of 337 Bugs for $0.42 Each using ChatGPT. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (Vienna, Austria) (ISSTA 2024). Association for Computing Machinery, New York, NY, USA, 819–831

  72. [72]

    Jifeng Xuan, Matias Martinez, Favio DeMarco, Maxime Clément, Sebastian Lamelas Marcote, Thomas Durieux, Daniel Le Berre, and Martin Monperrus. 2017. Nopol: Automatic Repair of Conditional Statement Bugs in Java Programs. IEEE Transactions on Software Engineering 43, 1 (2017), 34–55

  73. [73]

    Takashi Yagawa, Tadanori Teruya, Kuniyasu Suzaki, and Hirotake Abe. 2024. Delegating Verification for Remote Attestation Using TEE. In 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). 186–192

  74. [74]

    He Ye, Matias Martinez, Xiapu Luo, Tao Zhang, and Martin Monperrus. 2023. SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (Rochester, MI, USA) (ASE ’22). Association for Computing Machinery, New York, NY, USA, Article 92, 13 pages

  75. [75]

    He Ye, Matias Martinez, and Martin Monperrus. 2022. Neural program repair with execution-based backpropagation. In Proceedings of the 44th International Conference on Software Engineering (Pittsburgh, Pennsylvania) (ICSE ’22). Association for Computing Machinery, New York, NY, USA, 1506–1518

  76. [76]

    Zheng Yu, Ziyi Guo, Yuhang Wu, Jiahao Yu, Meng Xu, Dongliang Mu, Yan Chen, and Xinyu Xing. 2025. PATCHAGENT: A Practical Program Repair Agent Mimicking Human Expertise. In 34th USENIX Security Symposium, USENIX Security 2025, Seattle, WA, USA, August 13-15, 2025, Lujo Bauer and Giancarlo Pellegrino (Eds.). USENIX Association, 4381–4400

  77. [77]

    Quanjun Zhang, Chunrong Fang, Yuxiang Ma, Weisong Sun, and Zhenyu Chen. 2023. A Survey of Learning-based Automated Program Repair. ACM Trans. Softw. Eng. Methodol. 33, 2, Article 55 (Dec. 2023), 69 pages

  78. [78]

    Quanjun Zhang, Chunrong Fang, Yang Xie, YuXiang Ma, Weisong Sun, Yun Yang, and Zhenyu Chen. 2024. A Systematic Literature Review on Large Language Models for Automated Program Repair. arXiv:2405.01466 [cs.SE]

  79. [79]

    Yuntong Zhang, Andreea Costea, Ridwan Shariffdeen, Davin McCall, and Abhik Roychoudhury. 2025. EffFix: Efficient and Effective Repair of Pointer Manipulating Programs. ACM Trans. Softw. Eng. Methodol. 34, 3, Article 69 (Feb. 2025), 27 pages

  80. [80]

    Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, and Abhik Roychoudhury. 2024. AutoCodeRover: Autonomous Program Improvement. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (Vienna, Austria) (ISSTA 2024). Association for Computing Machinery, New York, NY, USA, 1592–1604
