MLIR: Scaling Compiler Infrastructure for Domain Specific Computation
13 Pith papers cite this work.
Citation roles: background (4)
Citation polarities: background (4)
citing papers explorer
- LLM Translation of Compiler Intermediate Representation
  IRIS-14B is the first LLM trained explicitly for GIMPLE-to-LLVM IR translation and outperforms much larger models by up to 44 percentage points on real-world C code.
- Mat2Boundary: Treating User-Defined Boundary Condition as SpMV for Distributed PDE Solvers on Block-Structured Grids
  Mat2Boundary treats boundary conditions as sparse matrix-vector products and uses multi-stage compilation with polyhedral analysis to generate efficient matrix-free kernels and communication schedules for distributed block-structured PDE solvers.
- Demonstrating a Future for MLIR-native DSL Compilers on a NumPy-like Example
  An MLIR-native NumPy-like DSL with a new dialect-agnostic type checker and parallel-first lowering to a dataflow dialect, shown on weather modeling and CFD workloads in Fortran.
- SSA without Dominance for Higher-Order Programs
  Free-variable sets and a nesting tree can replace dominance relations in SSA for higher-order programs, improving precision without requiring explicit control-flow graphs.
- Optimism in Equality Saturation
  A new abstract interpretation algorithm enables sound optimistic analysis of e-graphs during equality saturation, unifying it with non-destructive rewriting and improving precision on cyclic SSA programs.
- LEO: Tracing GPU Stall Root Causes via Cross-Vendor Backward Slicing
  LEO performs cross-vendor backward slicing from stalled GPU instructions to attribute root causes to source code, enabling optimizations that produce geometric-mean speedups of 1.73-1.82x on 21 workloads.
- EquivFusion: Unifying Hardware Equivalence Checking from Algorithms to Netlists via MLIR
  EquivFusion unifies equivalence checking across hardware design levels by lowering PyTorch, C/C++, Chisel, Verilog, and netlists via MLIR into SMT-LIB, BTOR2, and AIGER formats.
- KEET: Explaining Performance of GPU Kernels Using LLM Agents
  KEET uses LLM agents to generate data-grounded natural language explanations of performance issues in GPU kernels from Nsight Compute profiles and shows these improve downstream LLM-based optimization tasks.
- Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR
  Aquas delivers a holistic hardware-software co-optimization framework on MLIR that models memory interfaces with cache effects and uses an e-graph retargetable compiler, achieving up to 15.61x speedup with 14.5% area overhead across four domains.
- Analysis of Floating-Point Matrix Multiplication Computed via Integer Arithmetic
  Error analysis and cost estimator for recasting floating-point matrix multiplication as accumulated integer products on mixed-precision hardware.
- AutoLALA: Automatic Loop Algebraic Locality Analysis for AI and HPC Kernels
  AutoLALA automatically generates symbolic formulas for reuse distance and data movement complexity in affine loop programs using polyhedral lowering and Barvinok counting.
- SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents
- The EDGE Language: Extended General Einsums for Graph Algorithms