SemOpt: LLM-Driven Code Optimization via Rule-Based Analysis

Qianyu Xiao; Yingfei Xiong; Yuan-An Xiao; Yuwei Zhao; Zhao Zhang

arxiv: 2510.16384 · v2 · pith:4YXZHENBnew · submitted 2025-10-18 · 💻 cs.SE

Yuwei Zhao, Yuan-An Xiao, Qianyu Xiao, Zhao Zhang, Yingfei Xiong

classification 💻 cs.SE

keywords: code, optimization, semopt, analysis, llms, python, strategy, examples

Automated code optimization improves program performance through refactoring, and recent studies leverage LLMs for this purpose. Existing approaches mine optimization commits from open-source codebases to build large-scale knowledge bases, then employ retrieval techniques such as BM25 to obtain relevant examples for hotspot code, guiding LLMs in optimization. However, semantically equivalent optimizations often appear in syntactically dissimilar code, so current retrieval methods fail to identify pertinent examples, leading to suboptimal results. To address these limitations, we propose SemOpt, a framework that leverages static program analysis to identify code segments, retrieve optimization strategies, and generate optimized results. SemOpt has three LLM-powered components: (1) a strategy library builder that extracts and clusters strategies from code modifications, (2) a rule generator that produces Semgrep static analysis rules to capture each strategy's applicability, and (3) an optimizer that generates optimized code using the strategy library. On a benchmark of 151 C/C++ and 150 Python optimization tasks, SemOpt shows consistent improvements across different LLMs, increasing successful optimizations by 1.38 to 28 times on C/C++ and 4.60 to 6.33 times on Python versus the baseline. On large-scale projects, SemOpt improves performance metrics by 5.04% to 218.07% on C/C++ and 61.77% to 479.90% on Python, showing cross-language generalization and practical effectiveness.
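The retrieval problem described in the abstract can be illustrated with a toy sketch. This is not SemOpt's implementation: SemOpt generates Semgrep rules with an LLM, whereas the sketch below uses Python's standard `ast` module as a stand-in for a static analysis rule, and a plain token-set Jaccard score as a crude stand-in for lexical retrieval such as BM25. The function names, the two snippets, and the chosen strategy ("replace repeated list membership tests inside a loop with a set") are all hypothetical and chosen only for illustration.

```python
import ast


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace tokens (crude lexical proxy, not BM25)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def list_membership_in_loop(source: str) -> list:
    """Line numbers of `x in name` / `x not in name` tests inside loops,
    where `name` was assigned a list literal somewhere in the source.
    A hypothetical hand-written rule; real Semgrep rules are YAML patterns."""
    tree = ast.parse(source)

    # Step 1: collect names bound directly to a list literal.
    list_names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.List):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    list_names.add(target.id)

    # Step 2: find membership comparisons against those names inside loops.
    hits = set()
    for loop in ast.walk(tree):
        if not isinstance(loop, (ast.For, ast.While)):
            continue
        for node in ast.walk(loop):
            if not isinstance(node, ast.Compare):
                continue
            for op, comparator in zip(node.ops, node.comparators):
                if (
                    isinstance(op, (ast.In, ast.NotIn))
                    and isinstance(comparator, ast.Name)
                    and comparator.id in list_names
                ):
                    hits.add(node.lineno)
    return sorted(hits)


# Two snippets that share the same optimization opportunity
# but look quite different on the surface.
SNIPPET_A = """
blocked = ["a", "b", "c"]
kept = []
for user in users:
    if user not in blocked:
        kept.append(user)
"""

SNIPPET_B = """
primes = [2, 3, 5, 7]
i = 0
total = 0
while i < limit:
    total += 1 if i in primes else 0
    i += 1
"""

# Membership test outside any loop: the rule should not fire.
SNIPPET_C = """
allowed = [1, 2, 3]
ok = value in allowed
"""

if __name__ == "__main__":
    print("lexical overlap A vs B:", round(token_overlap(SNIPPET_A, SNIPPET_B), 3))
    print("rule hits in A:", list_membership_in_loop(SNIPPET_A))
    print("rule hits in B:", list_membership_in_loop(SNIPPET_B))
    print("rule hits in C:", list_membership_in_loop(SNIPPET_C))
```

In this toy setup the two snippets share few tokens, so a purely lexical retriever would likely rank one as a poor match for the other, while the structural check flags both. That is the intuition behind matching on a strategy's applicability conditions rather than on surface similarity.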

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search
cs.LO 2026-05 unverdicted novelty 6.0

Lean Refactor uses retrieval from a curated multi-objective strategy database to guide frozen LLMs in refactoring Lean proofs, reporting over 70% token compression on benchmarks and improved version transfer.