KernelBenchX benchmark shows task category explains nearly three times more variance in LLM kernel correctness than method choice, iterative refinement boosts correctness but reduces performance, and quantization remains unsolved.
Stark: Strategic team of agents for refining kernels.arXiv preprint arXiv:2510.16996
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2years
2026 2representative citing papers
AscendOptimizer combines kernel rewinding for reusable experience with evolutionary search on hardware feedback to optimize Ascend NPU operators, delivering 1.21x geometric-mean speedup and faster performance on 53.47% of 101 tested operators versus baseline.
citing papers explorer
-
KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels
KernelBenchX benchmark shows task category explains nearly three times more variance in LLM kernel correctness than method choice, iterative refinement boosts correctness but reduces performance, and quantization remains unsolved.
-
AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization
AscendOptimizer combines kernel rewinding for reusable experience with evolutionary search on hardware feedback to optimize Ascend NPU operators, delivering 1.21x geometric-mean speedup and faster performance on 53.47% of 101 tested operators versus baseline.