pith. sign in

arxiv: 2510.16384 · v2 · pith:4YXZHENBnew · submitted 2025-10-18 · 💻 cs.SE

SemOpt: LLM-Driven Code Optimization via Rule-Based Analysis

classification 💻 cs.SE
keywords codeoptimizationsemoptanalysisllmspythonstrategyexamples
0
0 comments X
read the original abstract

Automated code optimization improves program performance through refactoring, and recent studies leverage LLMs for this purpose. Existing approaches mine optimization commits from open-source codebases to build large-scale knowledge bases, then employ retrieval techniques such as BM25 to obtain relevant examples for hotspot code, guiding LLMs in optimization. However, semantically equivalent optimizations often appear in syntactically dissimilar code, so current retrieval methods fail to identify pertinent examples, leading to suboptimal results. To address these limitations, we propose SemOpt, a framework that leverages static program analysis to identify code segments, retrieve optimization strategies, and generate optimized results. SemOpt has three LLM-powered components: (1) a strategy library builder that extracts and clusters strategies from code modifications, (2) a rule generator that produces Semgrep static analysis rules to capture each strategy's applicability, and (3) an optimizer that generates optimized code using the strategy library. On a benchmark of 151 C/C++ and 150 Python optimization tasks, SemOpt shows consistent improvements across different LLMs, increasing successful optimizations by 1.38 to 28 times on C/C++ and 4.60 to 6.33 times on Python versus the baseline. On large-scale projects, SemOpt improves performance metrics by 5.04% to 218.07% on C/C++ and 61.77% to 479.90% on Python, showing cross-language generalization and practical effectiveness.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

    cs.LO 2026-05 unverdicted novelty 6.0

    Lean Refactor uses retrieval from a curated multi-objective strategy database to guide frozen LLMs in refactoring Lean proofs, reporting over 70% token compression on benchmarks and improved version transfer.