Employing Continuous Integration inspired workflows for benchmarking of scientific software -- a use case on numerical cut cell quadrature
Pith reviewed 2026-05-22 22:42 UTC · model grok-4.3
The pith
Continuous Integration tools automate benchmark execution and reporting for scientific software even as test designs evolve.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Established Continuous Integration tools and practices achieve high automation of benchmark execution and reporting for scientific software. The approach handles the rapid expansion of the parameter space and the common need to add new libraries, adapt metrics, or introduce new benchmark cases without requiring laborious re-evaluation of all prior results by hand.
What carries the argument
Continuous Integration inspired workflows that treat benchmark definitions, executions, and reports as versioned, automatically triggered tasks.
If this is right
- Adding a new software package requires only the definition of its build and run steps; all existing cases are then re-executed and the results compared automatically.
- Changes to metrics or addition of new test geometries trigger fresh runs and updated reports without separate manual intervention.
- Results remain reproducible because every benchmark step is captured in the same version-controlled scripts used by the CI system.
- The same workflow can compare packages that use fundamentally different discretizations on the same set of implicit or parametric domains.
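The consequences listed above can be illustrated with a minimal sketch. It assumes a hypothetical in-process registry rather than the paper's actual CI configuration; the names `register`, `run_all`, `PACKAGES`, `CASES`, and the two toy integrators are illustrative and do not come from the paper. Adding a package means adding one registered function, after which every case is re-run for every package.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: package name -> function returning an area estimate.
PACKAGES: Dict[str, Callable[[float, int], float]] = {}

# Hypothetical benchmark cases: (case name, disk radius, exact area).
CASES: List[Tuple[str, float, float]] = [
    ("disk_r1", 1.0, math.pi),
    ("disk_r2", 2.0, 4.0 * math.pi),
]


def register(name: str):
    """Decorator that adds a package's run step to the registry."""
    def wrap(func):
        PACKAGES[name] = func
        return func
    return wrap


@register("midpoint_grid")
def midpoint_grid(radius: float, n: int) -> float:
    """Count cell midpoints inside the disk on an n x n grid over its bounding box."""
    h = 2.0 * radius / n
    inside = 0
    for i in range(n):
        x = -radius + (i + 0.5) * h
        for j in range(n):
            y = -radius + (j + 0.5) * h
            if x * x + y * y <= radius * radius:
                inside += 1
    return inside * h * h


@register("polygon")
def polygon(radius: float, n: int) -> float:
    """Area of a regular n-gon inscribed in the circle."""
    return 0.5 * n * radius * radius * math.sin(2.0 * math.pi / n)


def run_all(n: int = 64) -> Dict[Tuple[str, str], float]:
    """Re-run every registered package on every case; return relative errors."""
    results = {}
    for pkg, func in PACKAGES.items():
        for case, radius, exact in CASES:
            results[(pkg, case)] = abs(func(radius, n) - exact) / exact
    return results


if __name__ == "__main__":
    for (pkg, case), err in sorted(run_all().items()):
        print(f"{pkg:15s} {case:8s} rel. error = {err:.3e}")
```

In a CI setting, a call such as `run_all()` would be triggered on each commit, so a newly registered package or case is picked up without editing the driver.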
Where Pith is reading between the lines
- The same pattern could reduce effort in other fields where competing numerical codes must be re-tested after each code or test-suite update.
- Public CI configurations could serve as living benchmark suites that external contributors extend through ordinary pull requests.
- Over time the collected data might reveal systematic performance differences between discretization families that are not visible in single-paper comparisons.
Load-bearing premise
Benchmark designs for scientific software will keep changing after the project starts, forcing repeated full evaluations of all packages and cases.
What would settle it
A side-by-side log showing that the total person-hours spent on manual benchmark maintenance for an evolving set of cut-cell quadrature cases is lower than the hours spent configuring and maintaining the equivalent Continuous Integration workflow.
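The comparison described here reduces to simple break-even arithmetic. The sketch below assumes a one-time setup cost for the CI workflow, a constant cost per benchmark revision for each approach, and no setup cost for the manual approach. The function names and all numbers in the usage note are hypothetical placeholders, not measurements from the paper.

```python
import math
from typing import Optional


def cumulative_hours(setup: float, per_change: float, n_changes: int) -> float:
    """Total person-hours after n_changes benchmark revisions."""
    return setup + per_change * n_changes


def break_even_changes(
    ci_setup: float, ci_per_change: float, manual_per_change: float
) -> Optional[int]:
    """Smallest number of revisions after which the CI workflow is strictly cheaper.

    Assumes the manual approach has no setup cost. Returns None if the CI
    workflow never becomes cheaper under this linear model.
    """
    saving = manual_per_change - ci_per_change
    if saving <= 0.0:
        return None
    return math.floor(ci_setup / saving) + 1
```

For example, with placeholder values of 40 hours of CI setup, 1 hour per revision under CI, and 6 hours per revision by hand, `break_even_changes(40.0, 1.0, 6.0)` returns 9. A real log would replace these placeholders with recorded hours.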
Original abstract
In the field of scientific computing, one often finds several alternative software packages (with open or closed source code) for solving a specific problem. These packages sometimes even use alternative methodological approaches, e.g., different numerical discretizations. If one decides to use one of these packages, it is often not clear which one is the best choice. To make an informed decision, it is necessary to measure the performance of the alternative software packages for a suitable set of test problems, i.e. to set up a benchmark. However, setting up benchmarks ad-hoc can become overwhelming as the parameter space expands rapidly. Very often, the design of the benchmark is also not fully set at the start of some project. For instance, adding new libraries, adapting metrics, or introducing new benchmark cases during the project can significantly increase complexity and necessitate laborious re-evaluation of previous results. This paper presents a proven approach that utilizes established Continuous Integration tools and practices to achieve high automation of benchmark execution and reporting. Our use case is the numerical integration (quadrature) on arbitrary domains, which are bounded by implicitly or parametrically defined curves or surfaces in 2D or 3D.
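To make the use case concrete, the following sketch integrates the constant function 1 over a domain defined implicitly by a level set, here the unit disk, whose exact area is pi. It uses a naive recursive subdivision of cut cells with an in/out test at the cell centre. This is a swapped-in illustration and not the algorithm of any benchmarked library; all function names are hypothetical. The corner-sign classification is a heuristic that can miss boundaries crossing a cell edge twice, which does not occur for the disk on an even background grid.

```python
import math
from typing import Callable, List

LevelSet = Callable[[float, float], float]


def unit_disk(x: float, y: float) -> float:
    """Level set of the unit disk: negative inside, positive outside."""
    return x * x + y * y - 1.0


def cell_area(phi: LevelSet, x0: float, y0: float, h: float, depth: int) -> float:
    """Approximate the area of {phi < 0} within the cell [x0, x0+h] x [y0, y0+h]."""
    corners = [phi(x0, y0), phi(x0 + h, y0), phi(x0, y0 + h), phi(x0 + h, y0 + h)]
    if all(c < 0.0 for c in corners):
        return h * h  # classified as fully inside
    if all(c >= 0.0 for c in corners):
        return 0.0  # classified as fully outside
    if depth == 0:
        # Leaf cut cell: crude in/out decision at the cell centre.
        return h * h if phi(x0 + 0.5 * h, y0 + 0.5 * h) < 0.0 else 0.0
    half = 0.5 * h
    # Cut cell: split into four children and recurse.
    return sum(
        cell_area(phi, x0 + i * half, y0 + j * half, half, depth - 1)
        for i in (0, 1)
        for j in (0, 1)
    )


def domain_area(phi: LevelSet, n: int, depth: int) -> float:
    """Sum cell contributions over an n x n background grid on [-1, 1]^2."""
    h = 2.0 / n
    return sum(
        cell_area(phi, -1.0 + i * h, -1.0 + j * h, h, depth)
        for i in range(n)
        for j in range(n)
    )


def relative_errors(max_depth: int, n: int = 8) -> List[float]:
    """Relative area error for subdivision depths 0..max_depth."""
    return [
        abs(domain_area(unit_disk, n, d) - math.pi) / math.pi
        for d in range(max_depth + 1)
    ]


if __name__ == "__main__":
    for depth, err in enumerate(relative_errors(4)):
        print(f"depth {depth}: rel. error = {err:.3e}")
```

A benchmark in the sense of the abstract would record such an error, together with cost measures, for each package and each test geometry.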
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that established Continuous Integration (CI) tools and practices can achieve high automation of benchmark execution and reporting for scientific software, even when benchmark designs evolve during a project (e.g., by adding libraries, adapting metrics, or introducing new cases). The use case is numerical quadrature on arbitrary 2D/3D domains bounded by implicitly or parametrically defined curves or surfaces.
Significance. If the CI-based workflow is shown to deliver the claimed automation, the paper would offer a practical methodological contribution for reproducible and maintainable benchmarking in scientific computing, addressing a common pain point of manual re-evaluation as requirements change.
major comments (1)
- [Abstract] The central claim that the approach is 'proven' and achieves 'high automation' for the quadrature use case is not supported by any implementation details, quantitative results, or verification of reduced manual effort. This claim is load-bearing for the methodological contribution.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The single major comment concerns support for claims in the abstract. We address it point-by-point below and propose targeted revisions where appropriate.
- Referee: [Abstract] The central claim that the approach is 'proven' and achieves 'high automation' for the quadrature use case is not supported by any implementation details, quantitative results, or verification of reduced manual effort. This claim is load-bearing for the methodological contribution.
Authors: The full manuscript (Sections 2–4) supplies concrete implementation details. We describe the CI pipeline configuration: YAML workflows, job matrices for 2D/3D quadrature cases, integration with cut-cell libraries, and automated result aggregation via scripts that regenerate tables and plots on each commit. The quadrature use case explicitly demonstrates handling of evolving benchmarks (adding a new library, metric, or geometry) without manual re-execution of prior cases, which is shown through before/after workflow diagrams and example repository commits. We acknowledge that the paper does not include quantitative time-and-effort measurements (e.g., person-hours before vs. after CI adoption); such metrics are inherently project-specific and were outside the scope of the methodological contribution. We will revise the abstract to replace 'proven' with 'demonstrated via a detailed use case' and add a short paragraph in the conclusions discussing the qualitative reduction in manual re-evaluation.
Revision: partial
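The job matrix and automated aggregation mentioned in the response can be sketched as follows. This is a minimal illustration, assuming a matrix with three axes and a plain-text report. The axis values (`lib_a`, `circle`, and so on) and the function names are hypothetical placeholders, not the paper's configuration. The expansion is a Cartesian product, similar in spirit to the matrix feature of common CI systems.

```python
import itertools
from typing import Dict, List, Tuple

Job = Tuple[str, str, str]  # (library, dimension, case)


def expand_matrix(matrix: Dict[str, List[str]]) -> List[Job]:
    """Cartesian product of the axes, in the order library, dimension, case."""
    return list(
        itertools.product(matrix["library"], matrix["dimension"], matrix["case"])
    )


def render_report(jobs: List[Job], results: Dict[Job, float]) -> str:
    """Plain-text table; jobs without a stored result are marked as pending."""
    lines = [f"{'library':10s} {'dim':4s} {'case':10s} {'rel. error':>12s}"]
    for job in jobs:
        value = results.get(job)
        cell = "pending" if value is None else f"{value:.3e}"
        lines.append(f"{job[0]:10s} {job[1]:4s} {job[2]:10s} {cell:>12s}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical axes; real values would come from the versioned CI configuration.
    matrix = {
        "library": ["lib_a", "lib_b"],
        "dimension": ["2D", "3D"],
        "case": ["circle", "ellipse"],
    }
    print(len(expand_matrix(matrix)), "jobs")
    # Adding one library extends the matrix without touching the driver code.
    matrix["library"].append("lib_c")
    jobs = expand_matrix(matrix)
    print(len(jobs), "jobs")
    print(render_report(jobs, results={}))
```

Because the report is regenerated from the full job list on every run, a newly added library shows up as pending rows until its results are available, rather than requiring a manual edit of the table.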
Circularity Check
No significant circularity; methodological workflow description only
The paper presents a CI-inspired workflow for automating benchmark execution and reporting in scientific computing, demonstrated on cut-cell quadrature. No derivations, equations, fitted parameters, or mathematical predictions exist in the text. The central claim is a practical methodology whose validity rests on implementation experience rather than any self-referential reduction, self-citation chain, or ansatz. All steps are descriptive and externally verifiable through the described tools and practices, with no load-bearing element that collapses to its own inputs by construction.