Towards Automated Kernel Generation in the Era of LLMs

Chi Hsu Tsai; Chunlei Men; Guang Liu; Haiming Wu; Haoyu Wang; Jialing Zhang; Jingze Shi; Peiyu Zang; Wentao Zhang; Yang Yu

arxiv: 2601.15727 · v3 · pith:ABLK63ZYnew · submitted 2026-01-22 · 💻 cs.LG · cs.CL

Yang Yu, Peiyu Zang, Chi Hsu Tsai, Haiming Wu, Yixin Shen, Jialing Zhang, Haoyu Wang, Zhiyou Xiao, Jingze Shi, Yuyu Luo, Wentao Zhang, Chunlei Men, Guang Liu, Yonghua Lin

keywords: kernel, generation, optimization, agentic, approaches, automated, expert-level, field

The performance of modern AI systems is fundamentally constrained by the quality of their underlying GPU kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programming models, making kernel engineering a critical but notoriously time-consuming and non-scalable process. Recent advances in large language models and LLM-based agents have opened new possibilities for automating kernel generation and optimization. LLMs are well-suited to compress expert-level kernel knowledge that is difficult to formalize, while agentic systems further enable scalable optimization by casting kernel development as an iterative, feedback-driven loop. Rapid progress has been made in this area. However, the field remains fragmented and lacks a systematic perspective on LLM-driven kernel generation. This survey addresses this gap by providing a structured overview of existing approaches, spanning LLM-based approaches and agentic optimization workflows, and by systematically organizing the datasets and benchmarks that underpin learning and evaluation in this domain. It also outlines key open challenges and future research directions, aiming to establish a comprehensive reference for the next generation of automated kernel optimization. To keep track of this field, we maintain an open-source GitHub repository at https://github.com/flagos-ai/awesome-LLM-driven-kernel-generation.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels
cs.LG 2026-05 unverdicted novelty 7.0

The KernelBenchX benchmark shows that task category predicts LLM kernel correctness better than method choice, iterative refinement trades performance for higher success rates, and correctness does not ensure efficiency gains...
KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels
cs.LG 2026-05 conditional novelty 7.0

KernelBenchX benchmark shows task category explains nearly three times more variance in LLM kernel correctness than method choice, iterative refinement boosts correctness but reduces performance, and quantization rema...
KEET: Explaining Performance of GPU Kernels Using LLM Agents
cs.PF 2026-05 unverdicted novelty 5.0

KEET uses LLM agents to generate data-grounded natural language explanations of performance issues in GPU kernels from Nsight Compute profiles and shows these improve downstream LLM-based optimization tasks.