arxiv: 0911.3456 · v2 · pith:QDUCQCCV · submitted 2009-11-18 · 💻 cs.DC · cs.SE

PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation

classification 💻 cs.DC cs.SE
keywords computing performance pycuda pyopencl technique article code generation

Abstract

High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success.
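The core idea of RTCG described in the abstract can be illustrated with a minimal sketch: a Python script assembles CUDA C source text at run time, specialised for parameters known only when the program runs, and then hands that text to a compiler. The kernel name `scale_add`, the template, and the `dtype`/`unroll` parameters below are hypothetical choices made for illustration and are not taken from the paper. The source generation uses only the Python standard library; the optional `compile_kernel` helper assumes PyCUDA is installed and a CUDA-capable GPU is present.

```python
from string import Template

# Hypothetical kernel template: the kernel name and its parameters are
# illustrative assumptions, not code from the paper.
KERNEL_TEMPLATE = Template("""
__global__ void scale_add($dtype *y, const $dtype *x, $dtype a, int n)
{
    int base = (blockIdx.x * blockDim.x + threadIdx.x) * $unroll;
$body
}
""")


def make_kernel_source(dtype="float", unroll=2):
    """Generate CUDA C source text specialised for a dtype and unroll factor."""
    # Each thread handles `unroll` consecutive elements; the loop is
    # unrolled here, in Python, before the GPU compiler ever sees the code.
    lines = [
        f"    if (base + {i} < n) y[base + {i}] += a * x[base + {i}];"
        for i in range(unroll)
    ]
    return KERNEL_TEMPLATE.substitute(
        dtype=dtype, unroll=unroll, body="\n".join(lines)
    )


def compile_kernel(source):
    """Compile generated source; needs PyCUDA and a CUDA-capable GPU."""
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    from pycuda.compiler import SourceModule

    return SourceModule(source).get_function("scale_add")


if __name__ == "__main__":
    # Print the generated source; no GPU is needed for this step.
    print(make_kernel_source("double", unroll=4))
```

Keeping generation separate from compilation means the generated text can be inspected or tested on a machine without a GPU, while the same string can be passed to `compile_kernel` where PyCUDA is available.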

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Lya2pcf: an efficient pipeline to estimate two- and three-point correlation functions of the Lyman-$\alpha$ forest

    astro-ph.CO · 2025-06 · unverdicted · novelty 6.0

    Lya2pcf is an efficient pipeline implementing standard algorithms for 2PCF and 3PCF of the Lyman-alpha forest, with GPU speedups over PICCA and the first large-sample anisotropic 3PCF measurement up to 80 Mpc/h.