
arxiv: 2604.25812 · v1 · submitted 2026-04-28 · 💻 cs.CY

Hands-on PDC in Undergraduate Computing Education

Pith reviewed 2026-05-07 14:24 UTC · model grok-4.3

classification 💻 cs.CY
keywords parallel and distributed computing · undergraduate education · high-performance computing · project-based learning · multithreading · OpenMP · matrix multiplication

The pith

Structured access to real HPC infrastructure deepens undergraduate understanding of parallelism and multithreading

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that undergraduate computer science students benefit from direct, hands-on experience with real high-performance computing (HPC) systems when learning parallel and distributed computing (PDC). It details an assignment in which students use the HiPerGator supercomputer to implement and benchmark parallel matrix multiplication in Python and C, dealing with practical issues such as batch scheduling, core allocation, and performance tuning. A three-year evaluation across multiple course offerings is presented as evidence that this structured access deepens understanding of parallelism and multithreading. This matters because PDC is a core but difficult part of the curriculum, and practical projects help solidify abstract ideas.

Core claim

The central claim is that engaging students directly with the University of Florida's HiPerGator supercomputer to implement and benchmark matrix multiplication using Python and C via POSIX threads and OpenMP, while navigating batch scheduling and performance tuning, deepens their understanding of parallelism and multithreading, as shown by evaluations across three years of course offerings.

What carries the argument

A detailed practical assignment for implementing parallel matrix multiplication on a real supercomputer with Python, POSIX threads, and OpenMP, including batch job management and core allocation.
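The row-wise decomposition behind an assignment like this can be sketched in Python. This is a minimal illustration, not the authors' assignment code: the function names `matmul_rows` and `threaded_matmul` and the contiguous row-chunking scheme are assumptions made for the example. In standard CPython builds the global interpreter lock serializes pure-Python bytecode, so the sketch shows the partitioning pattern rather than a guaranteed speedup.

```python
import threading


def matmul_rows(a, b, c, row_start, row_end):
    """Fill rows [row_start, row_end) of c with the product a @ b."""
    inner = len(b)
    cols = len(b[0])
    for i in range(row_start, row_end):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                total += a[i][k] * b[k][j]
            c[i][j] = total


def threaded_matmul(a, b, num_threads):
    """Multiply a (n x m) by b (m x p) using up to num_threads threads.

    Each thread writes a disjoint block of rows, so no locking is needed.
    num_threads is assumed to be at least 1.
    """
    n = len(a)
    c = [[0.0] * len(b[0]) for _ in range(n)]
    # Ceiling division: contiguous row chunks, one chunk per thread.
    chunk = (n + num_threads - 1) // num_threads
    workers = []
    for t in range(num_threads):
        start = t * chunk
        end = min(start + chunk, n)
        if start >= end:
            break  # more threads than rows; the rest have nothing to do
        worker = threading.Thread(
            target=matmul_rows, args=(a, b, c, start, end)
        )
        workers.append(worker)
        worker.start()
    for worker in workers:
        worker.join()
    return c
```

Timing `threaded_matmul` for several values of `num_threads` on a fixed matrix size is one way to produce the kind of thread-count comparison the assignment asks for.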

If this is right

  • Students retain knowledge of parallelism more effectively through project-based HPC work.
  • Direct experience with multithreading tools improves conceptual grasp beyond lectures.
  • Real HPC infrastructure can be integrated into undergraduate courses to enhance learning outcomes.
  • Performance benchmarking activities highlight the practical benefits of parallelization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar hands-on assignments could be developed using cloud HPC resources for institutions without local supercomputers.
  • The approach may extend to other areas of computing education such as distributed systems or machine learning on clusters.
  • Comparative studies with simulated environments would help isolate the value of real hardware access.

Load-bearing premise

That the positive changes in student understanding across the three-year period are caused by the hands-on HPC experience and not by other course components, motivation, or novelty.

What would settle it

If students in a version of the course without the supercomputer assignment demonstrate equivalent improvements in understanding of parallelism and multithreading, the link to the hands-on experience would be called into question.

Figures

Figures reproduced from arXiv: 2604.25812 by Anas Gamal Aly, Hala ElAarag.

Figure 1: Performance of C with OpenMP. Figures 1 and 2 show a student's comparison of C-based implementations. In their reports, students identified the C implementations as significantly faster than Python, as expected. They noted that the OpenMP version (…

Figure 2: Performance of C with POSIX threads. The slight performance differences and diminishing returns at higher thread counts led students to independently research and discuss concepts like thread creation overhead and the limits of parallelization. A key insight for many students was the unexpected performance behavior of multithreaded Python. As shown in …

Figure 3: Performance of multithreaded Python.
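The diminishing returns at higher thread counts mentioned in the captions can be quantified with speedup, parallel efficiency, and the Amdahl's law upper bound. The sketch below is a generic illustration, not the analysis used in the paper: the helper names and the parallel fraction of 0.9 are assumptions chosen for the example, and no measured timings from the paper are used.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-thread run time to multi-thread run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, threads):
    """Speedup per thread; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / threads


def amdahl_bound(parallel_fraction, threads):
    """Upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Assumed parallel fraction of 0.9, purely for illustration.
    for threads in (1, 2, 4, 8, 16, 32, 64):
        bound = amdahl_bound(0.9, threads)
        print(f"{threads:>2} threads: speedup bound {bound:.2f}")
```

With a parallel fraction of 0.9 the bound can never reach 10 regardless of thread count, which is the flattening behavior the captions describe. Thread creation overhead would lower measured speedups further below this bound.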
Original abstract

Parallel and Distributed Computing (PDC) is a critical yet conceptually challenging area of the undergraduate computer science curriculum. While students often encounter these concepts in theory, few gain exposure to experience in real high-performance computing (HPC) environments. Research shows that when students are engaged in project-based learning they retain knowledge more effectively. They also develop a deeper understanding of concepts taught in the classroom. This paper presents a practical assignment in which students engage directly with the University of Florida's HiPerGator supercomputer to implement and benchmark matrix multiplication using Python and C (via POSIX threads and OpenMP). Students navigate batch scheduling, core allocation, and performance tuning, experiences that are rarely accessible at the undergraduate level. We describe the assignment in detail and provide a three-year evaluation across multiple course offerings, highlighting how structured access to real HPC infrastructure can deepen student understanding of parallelism and multithreading.
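Core allocation under a batch scheduler usually means the program should size its thread pool to what the job was granted, not to what the node physically has. The sketch below assumes a Slurm-managed cluster, which the abstract does not state, and reads the `SLURM_CPUS_PER_TASK` environment variable that Slurm sets when a job requests `--cpus-per-task`. The helper name `allocated_cores` is hypothetical.

```python
import os


def allocated_cores(env=None):
    """Return the core count granted to this job.

    Assumes Slurm: uses SLURM_CPUS_PER_TASK when it is set to a positive
    integer, otherwise falls back to the local CPU count (at least 1).
    """
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK")
    if value is not None and value.isdigit() and int(value) > 0:
        return int(value)
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Using {allocated_cores()} worker threads")
```

Passing a plain dictionary as `env` makes the helper easy to check outside a batch job, for example `allocated_cores({"SLURM_CPUS_PER_TASK": "8"})`.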

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper describes a hands-on assignment in which undergraduate CS students access the University of Florida HiPerGator supercomputer to implement and benchmark matrix multiplication in Python and C using POSIX threads and OpenMP, including batch scheduling, core allocation, and performance tuning. It reports a three-year evaluation across multiple course offerings and claims that structured access to real HPC infrastructure deepens student understanding of parallelism and multithreading.

Significance. If the evaluation were strengthened to demonstrate causal impact, the work would offer a replicable model for incorporating authentic HPC resources into undergraduate PDC curricula, addressing a documented gap between theoretical coverage and practical experience. The detailed assignment description itself provides immediate value for instructors seeking to adopt similar projects.

major comments (2)
  1. [Evaluation section (and abstract)] The central claim that the HPC assignment deepens understanding of parallelism and multithreading rests on the three-year evaluation, yet the manuscript provides no description of assessment instruments (e.g., concept inventories), control cohorts without the HPC component, sample sizes, pre/post measures, or statistical methods to isolate effects from other course elements, student selection, or novelty.
  2. [Evaluation section] The abstract and evaluation narrative assert improved outcomes across offerings but do not report longitudinal tracking of specific learning objectives or adjustment for confounders such as instructor changes or maturation effects, leaving the attribution to the hands-on PDC experience unsupported.
minor comments (2)
  1. [Assignment description] The assignment description would benefit from explicit pseudocode or code snippets for the matrix-multiplication kernels to aid replication.
  2. [Evaluation section] Clarify whether the three-year span involved the same instructor and course structure or whether other variables changed concurrently.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. The comments on the evaluation section are well taken, and we have revised the manuscript to provide a more transparent description of our methods while moderating the strength of our claims to match the observational nature of the data collected.

  1. Referee: [Evaluation section (and abstract)] The central claim that the HPC assignment deepens understanding of parallelism and multithreading rests on the three-year evaluation, yet the manuscript provides no description of assessment instruments (e.g., concept inventories), control cohorts without the HPC component, sample sizes, pre/post measures, or statistical methods to isolate effects from other course elements, student selection, or novelty.

    Authors: We appreciate this observation. The three-year evaluation draws on anonymous end-of-course surveys containing Likert-scale and open-ended items that asked students to rate their understanding of parallelism and multithreading before versus after completing the assignment, together with review of submitted code, benchmark outputs, and instructor observations of student engagement during lab sessions. We have added a new subsection to the Evaluation section that describes the survey instrument, lists representative questions, reports response rates (approximately 75 % across offerings), and gives the per-offering sample sizes (ranging from 42 to 61 students). Because the assignment was a required component of the course, no parallel control cohort existed. No formal concept inventories, pre/post testing battery, or statistical modeling to isolate effects were employed. In the revised version we have updated both the abstract and the evaluation narrative to characterize the evidence as consistent positive student self-reports and successful project outcomes rather than as causally isolated effects. revision: yes

  2. Referee: [Evaluation section] The abstract and evaluation narrative assert improved outcomes across offerings but do not report longitudinal tracking of specific learning objectives or adjustment for confounders such as instructor changes or maturation effects, leaving the attribution to the hands-on PDC experience unsupported.

    Authors: We agree that the original text did not adequately address these issues. All three offerings were taught by the same instructor, which we now state explicitly; we have also added a short discussion of other potential confounders, including gradual increases in students' prior exposure to parallel programming tools and the inherent maturation of the course materials themselves. No longitudinal tracking of individual students across years was performed, as each cohort was distinct. We have revised the abstract and evaluation narrative to replace stronger causal phrasing with more cautious language that reports observed patterns of student feedback and project success without claiming definitive attribution solely to the HPC component. revision: yes

Circularity Check

0 steps flagged

No circularity: descriptive educational report with no derivations or self-referential predictions


The paper is a descriptive report of an educational assignment involving matrix multiplication on HiPerGator using Python/C with POSIX threads/OpenMP, plus a three-year evaluation across course offerings. It contains no mathematical derivations, equations, fitted parameters, or predictions that reduce to inputs by construction. General references to project-based learning research are not self-citations and do not bear the central claim. The evaluation is presented as observational outcomes without any fitted models or uniqueness theorems imported from prior author work. The derivation chain is absent; the paper is self-contained as standard educational reporting against external benchmarks of assignment descriptions and student feedback.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical report on a teaching intervention. No free parameters, axioms, or invented entities are used.


