arxiv: 1906.08613 · v1 · pith:6RWQPB62new · submitted 2019-06-20 · 💻 cs.MS

Program Generation for Linear Algebra Using Multiple Layers of DSLs

Pith reviewed 2026-05-25 19:07 UTC · model grok-4.3

classification 💻 cs.MS
keywords program generation, domain-specific languages, linear algebra, BLAS, LAPACK, code generation, numerical libraries

The pith

Domain-specific generator with multiple DSL layers creates tailored linear algebra routines

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper argues that a program generator built from multiple layers of domain-specific languages can produce linear algebra library routines customized to an application's exact sizes, interface, and target architecture. Standard libraries such as BLAS and LAPACK provide portable performance but offer limited flexibility. With layered DSLs, the generator produces routines that fit the application precisely, rather than forcing the application to adapt to a fixed library. This matters to a reader if on-demand generation of optimized code leads to higher performance and easier development in computational applications.

Core claim

We advocate a domain-specific program generator capable of producing library routines tailored to the specific needs of the application in terms of sizes, interface, and target architecture.

What carries the argument

A program generator employing multiple layers of domain-specific languages to synthesize linear algebra routines.
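
As a rough illustration of what size-specific generation can look like, the sketch below lowers a matrix-vector product y = A x through two toy layers: a mathematical specification (the sizes m and n), a scalar-level intermediate form, and finally loop-free target code. The layer split, the function names (`lower_to_scalar_statements`, `emit_python`, `build_matvec`), and the use of Python as the target language are assumptions made for this example; they are not the DSLs or the generator described in the paper.

```python
def lower_to_scalar_statements(m, n):
    """Layer 1 -> layer 2: expand y = A x (A is m x n, m and n positive).

    Row i becomes a list of (i, j) index pairs, one per product A[i][j] * x[j].
    """
    return [[(i, j) for j in range(n)] for i in range(m)]


def emit_python(rows, name):
    """Layer 2 -> target code: emit loop-free Python source for the fixed sizes."""
    exprs = [" + ".join(f"A[{i}][{j}] * x[{j}]" for i, j in row) for row in rows]
    body = ", ".join(exprs)
    return f"def {name}(A, x):\n    return [{body}]\n"


def build_matvec(m, n):
    """Generate and load a routine specialized to an m x n matrix (hypothetical helper)."""
    name = f"matvec_{m}x{n}"
    source = emit_python(lower_to_scalar_statements(m, n), name)
    namespace = {}
    exec(source, namespace)  # compile the generated source into a callable
    return source, namespace[name]


source, matvec_2x3 = build_matvec(2, 3)
print(source)
print(matvec_2x3([[1, 2, 3], [4, 5, 6]], [1, 0, -1]))  # -> [-2, -2]
```

The generated routine has the sizes baked into its name and body, so the caller passes only A and x. A real generator would emit C with vector intrinsics or similar for a chosen architecture; this sketch only shows the idea of lowering through successive representations.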

Load-bearing premise

Limitations in the flexibility of existing libraries such as BLAS and LAPACK can be effectively overcome by a domain-specific program generator using multiple layers of DSLs.

What would settle it

A demonstration that, for common application scenarios, the generated code fails to reach the required level of customization or performance when compared with standard libraries.

Figures

Figures reproduced from arXiv: 1906.08613 by Daniele G. Spampinato (1), Diego Fabregat-Traver (2), Markus Püschel (1), Paolo Bientinesi (2) ((1) ETH Zurich, (2) RWTH Aachen University).

Figure 1: Structure of our linear algebra generator. The left and right columns …
Figure 2: Performance results for: (a) $X_u^T X_u = A$, (b) $L X_s + X_s L^T = S$, and (c) $L X + X U = C$. All matrices $\in \mathbb{R}^{n \times n}$; $A$, $L$, $S$, $U$, and $C$ are inputs, $X_*$ are outputs; $L$ is lower triangular, $U$ and $X_u$ are upper triangular, $S$ and $X_s$ are symmetric, and $A$ is symmetric positive definite. In (a) $f \approx n^3/3$ flops, while in (b)–(c) $f \approx 2n^3$ flops. Tests compiled with icc v.16 and run on an Intel Sandy Bridge (AVX, 32 kB L1-D cache, 256 kB …
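
To make computation (c) in the Figure 2 caption concrete, the sketch below solves LX + XU = C by plain substitution. Because L is lower triangular and U is upper triangular, entry (i, j) of the equation involves only entries of X from earlier rows of column j and earlier columns of row i, so X can be filled in row by row. This is a textbook-style reference loop written for illustration, not the generated code measured in the figure; the function names are hypothetical, and the sketch assumes L[i][i] + U[j][j] is nonzero for all i, j. Counting operations in the loop gives about 2n^3 flops, which is consistent with the figure quoted in the caption.

```python
def solve_triangular_sylvester(L, U, C):
    """Solve L X + X U = C for X, with L lower and U upper triangular (n x n lists)."""
    n = len(C)
    X = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = C[i][j]
            # Subtract contributions of already-computed entries in column j.
            for k in range(i):
                acc -= L[i][k] * X[k][j]
            # Subtract contributions of already-computed entries in row i.
            for k in range(j):
                acc -= X[i][k] * U[k][j]
            # Remaining term is (L[i][i] + U[j][j]) * X[i][j]; assumed nonzero.
            X[i][j] = acc / (L[i][i] + U[j][j])
    return X


def residual(L, U, C, X):
    """Largest absolute entry of L X + X U - C, as a correctness check."""
    n = len(C)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            lx = sum(L[i][k] * X[k][j] for k in range(n))
            xu = sum(X[i][k] * U[k][j] for k in range(n))
            worst = max(worst, abs(lx + xu - C[i][j]))
    return worst


L = [[1.0, 0.0], [2.0, 3.0]]
U = [[1.0, 2.0], [0.0, 1.0]]
C = [[2.0, 6.0], [14.0, 26.0]]
X = solve_triangular_sylvester(L, U, C)
print(X)  # -> [[1.0, 2.0], [3.0, 4.0]]
print(residual(L, U, C, X))
```

A generator of the kind the paper advocates would specialize such a computation to fixed n and to the target's vector instructions; the loop above only pins down what is being computed.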
Original abstract

Numerical software in computational science and engineering often relies on highly-optimized building blocks from libraries such as BLAS and LAPACK, and while such libraries provide portable performance for a wide range of computing architectures, they still present limitations in terms of flexibility. We advocate a domain-specific program generator capable of producing library routines tailored to the specific needs of the application in terms of sizes, interface, and target architecture.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper claims that libraries such as BLAS and LAPACK provide portable performance across architectures but suffer from limitations in flexibility with respect to problem sizes, interfaces, and target architectures. It advocates the development of a domain-specific program generator that employs multiple layers of DSLs to automatically produce customized linear algebra library routines tailored to specific application needs.

Significance. If successfully realized, the advocated multi-layer DSL generator could meaningfully improve the adaptability and performance of numerical software in computational science and engineering by enabling architecture- and application-specific code generation beyond what static libraries currently allow. The position aligns with established trends in program generation for high-performance computing.

minor comments (1)
  1. The manuscript consists solely of a high-level advocacy statement with no concrete examples, pseudocode, or discussion of specific DSL layers, making it difficult to assess the practicality of the proposed approach.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our position paper and for the recommendation of minor revision. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is an advocacy/position document whose central claim is a recommendation to use multi-layer DSL program generators for producing tailored linear algebra routines. No equations, derivations, fitted parameters, or formal uniqueness theorems appear in the text. The argument remains at the conceptual level of motivation and does not reduce any prediction or result to its own inputs by construction, self-citation chains, or ansatz smuggling. The work is therefore self-contained against external benchmarks with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract; no free parameters, axioms, or invented entities are specified in the available text.

pith-pipeline@v0.9.0 · 5618 in / 897 out tokens · 19551 ms · 2026-05-25T19:07:32.443193+00:00 · methodology

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages

  3. [3]

    J. J. Dongarra et al. A set of level 3 basic linear algebra subprograms. ACM Trans. on Mathematical Software (TOMS), 16(1):1-17, 1990

  4. [4]

    E. Anderson et al. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, third edition, 1999

  5. [5]

    P. Bientinesi, J. A. Gunnels, M. E. Myers, E. S. Quintana-Ortí, and R. A. van de Geijn. The science of deriving dense linear algebra algorithms. ACM Trans. on Mathematical Software (TOMS), 31(1):1-26, 2005

  6. [6]

    D. Fabregat-Traver and P. Bientinesi. Automatic Generation of Loop-Invariants for Matrix Operations. In Computational Science and Its Applications (ICCSA), pp. 82-92, 2011

  7. [7]

    D. Fabregat-Traver and P. Bientinesi. Knowledge-Based Automatic Generation of Partitioned Matrix Expressions. In Computer Algebra in Scientific Computing (CASC), vol. 6885 of Lecture Notes in Computer Science (LNCS), pp. 144-157. Springer, 2011

  8. [8]

    M. Püschel, F. Franchetti, and Y. Voronenko. Encyclopedia of Parallel Computing, chap. Spiral. Springer, 2011

  9. [9]

    D. G. Spampinato and M. Püschel. A basic linear algebra compiler. In Code Generation and Optimization (CGO), pp. 23-32, 2014

  10. [10]

    D. G. Spampinato and M. Püschel. A basic linear algebra compiler for structured matrices. In Code Generation and Optimization (CGO), pp. 117-127, 2016

  11. [11]

    C. Bastoul. Code generation in the polyhedral model is easier than you think. In Parallel Architectures and Compilation Techniques (PACT), pp. 7-16, 2004