Parallel Sparse and Data-Sparse Factorization-based Linear Solvers
Pith reviewed 2026-05-25 07:02 UTC · model grok-4.3
The pith
Direct solvers remain essential for robust large-scale linear systems via advances in parallel communication reduction and low-rank compression.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Because of their robustness and accuracy, direct solvers are crucial components in building a scalable solver toolchain, and the key recent advances worth highlighting are techniques for communication reduction and low-rank compression in parallel sparse direct solvers.
What carries the argument
Sparse direct solvers that combine task- and data-parallel communication reduction with low-rank and hierarchical matrix algebra compression to handle factorization of large ill-conditioned systems.
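To make the low-rank compression idea concrete, here is a minimal sketch, not taken from the reviewed chapter, that compresses one dense interaction block between two well-separated point clusters. It uses a greedy, fully pivoted cross approximation as a stand-in for whatever compression kernels the surveyed solvers actually use; the function names, the `1/|y - x|` kernel, the cluster geometry, and the tolerance are all illustrative assumptions.

```python
def cross_approximation(A, tol=1e-6):
    """Greedy, fully pivoted cross approximation of a dense block.

    Returns lists U, V of equal length r such that
    A[i][j] ~= sum(U[k][i] * V[k][j] for k in range(r)).
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]                      # residual, updated in place
    scale = max(abs(x) for row in A for x in row)  # reference magnitude
    U, V = [], []
    while len(U) < min(m, n):
        # Pivot: largest remaining residual entry.
        i, j = max(((p, q) for p in range(m) for q in range(n)),
                   key=lambda pq: abs(R[pq[0]][pq[1]]))
        pivot = R[i][j]
        if abs(pivot) <= tol * scale:
            break                                  # residual is small enough
        u = [R[p][j] / pivot for p in range(m)]    # scaled pivot column
        v = R[i][:]                                # pivot row
        for p in range(m):
            for q in range(n):
                R[p][q] -= u[p] * v[q]             # rank-1 update of the residual
        U.append(u)
        V.append(v)
    return U, V


def reconstruct(U, V, m, n):
    """Rebuild the dense block from its rank-1 terms (for checking only)."""
    return [[sum(u[p] * v[q] for u, v in zip(U, V)) for q in range(n)]
            for p in range(m)]


# Hypothetical test block: interaction between two well-separated clusters.
n = 32
xs = [k / (n - 1) for k in range(n)]         # sources in [0, 1]
ys = [3.0 + k / (n - 1) for k in range(n)]   # targets in [3, 4]
A = [[1.0 / abs(y - x) for x in xs] for y in ys]

U, V = cross_approximation(A, tol=1e-6)
rank = len(U)
approx = reconstruct(U, V, n, n)
max_err = max(abs(A[p][q] - approx[p][q]) for p in range(n) for q in range(n))
print("rank:", rank, "max entrywise error:", max_err)
```

The sketch forms the whole block and searches it for pivots, which is only sensible for a demonstration; practical compression schemes avoid touching every entry. The point is only that a block coupling distant clusters can be represented by far fewer numbers than its dense form, with an error set by the tolerance.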
If this is right
- Direct solvers can solve ill-conditioned and indefinite equations more efficiently in parallel environments.
- Scalable solver toolchains become feasible for applications in multiphysics, machine learning, and data science.
- High speed and reliability are delivered on heterogeneous parallel machines through targeted parallelization practices.
- Computational complexity drops via low-rank approximations, with the accuracy of the factorization governed by the chosen compression tolerance.
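The complexity reduction in the last bullet can be illustrated with a standard operation count for a single compressed block. This is a generic textbook estimate under the assumption that an m-by-n block has numerical rank r; it is not a figure quoted from the chapter, and the symbols m, n, r, U, V are introduced here only for illustration.

```latex
\[
A \approx U V^{T}, \qquad
U \in \mathbb{R}^{m \times r},\quad V \in \mathbb{R}^{n \times r},
\]
\[
\underbrace{mn}_{\text{dense storage}}
\;\longrightarrow\;
\underbrace{r(m+n)}_{\text{low-rank storage}},
\qquad
\underbrace{\approx 2mn}_{\text{dense matvec flops}}
\;\longrightarrow\;
\underbrace{\approx 2r(m+n)}_{\text{low-rank matvec flops}},
\]
\[
\text{so compression pays off when } r < \frac{mn}{m+n}
\quad \bigl(\text{for } m = n:\ r < n/2\bigr).
\]
```

The approximation error of the block, and hence of any factorization built from such blocks, depends on how r is chosen relative to the decay of the block's singular values.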
Where Pith is reading between the lines
- These advances may expand the range of problems where direct solvers are preferred over iterative alternatives due to guaranteed robustness.
- Implementation details from the review could guide development of hybrid solver libraries that mix direct and other methods.
- The techniques suggest potential extensions to time-dependent or nonlinear problems where repeated factorizations occur.
Load-bearing premise
The reviewed techniques in communication reduction and low-rank compression constitute the key recent advances worth highlighting for parallel sparse direct solvers.
What would settle it
Demonstration on large ill-conditioned systems that communication reduction or low-rank compression techniques fail to improve scalability, accuracy, or reliability of direct solvers compared to prior methods.
Original abstract
Efficient solutions of large-scale, ill-conditioned and indefinite algebraic equations are ubiquitously needed in numerous computational fields, including multiphysics simulations, machine learning, and data science. Because of their robustness and accuracy, direct solvers are crucial components in building a scalable solver toolchain. In this chapter, we will review recent advances of sparse direct solvers along two axes: 1) reducing communication and latency costs in both task- and data-parallel settings, and 2) reducing computational complexity via low-rank and other compression techniques such as hierarchical matrix algebra. In addition to algorithmic principles, we also illustrate the key parallelization challenges and best practices to deliver high speed and reliability on modern heterogeneous parallel machines.
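As one illustration of what task parallelism means for a sparse triangular solve, the sketch below groups the rows of a sparse lower-triangular factor into dependency levels (often called level scheduling). It is a generic textbook technique shown under stated assumptions, not the algorithm of the chapter: the data layout (`pattern`, `values`, `diag`) and the 6-by-6 example are hypothetical.

```python
def level_sets(lower_pattern):
    """Group the rows of a sparse lower-triangular matrix into dependency levels.

    lower_pattern[i] lists the column indices j < i with a nonzero L[i][j].
    Row i can be solved once every such x[j] is known, so its level is
    1 + max(level of its dependencies); rows sharing a level are independent.
    """
    n = len(lower_pattern)
    level = [0] * n
    for i in range(n):
        for j in lower_pattern[i]:
            level[i] = max(level[i], level[j] + 1)
    levels = [[] for _ in range(max(level) + 1)] if n else []
    for i, lv in enumerate(level):
        levels[lv].append(i)
    return levels


def solve_by_levels(lower_pattern, values, diag, b):
    """Forward substitution that walks the levels in order.

    Within one level the rows do not depend on each other, so the inner
    loop over `rows` is where parallel tasks could be launched.
    """
    x = [0.0] * len(b)
    for rows in level_sets(lower_pattern):
        for i in rows:
            s = sum(values[i][k] * x[j] for k, j in enumerate(lower_pattern[i]))
            x[i] = (b[i] - s) / diag[i]
    return x


# Hypothetical 6x6 factor: off-diagonal pattern, matching values, and diagonal.
pattern = [[], [], [], [0, 1], [2], [3, 4]]
values = [[], [], [], [1.0, 2.0], [3.0], [1.0, 1.0]]
diag = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
b = [2.0, 4.0, 6.0, 9.0, 11.0, 12.0]

levels = level_sets(pattern)
x = solve_by_levels(pattern, values, diag, b)
print(levels)
print(x)
```

The number of levels bounds the number of synchronization points, which is one reason the shape of the dependency structure matters for latency on parallel machines.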
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This manuscript is a survey chapter reviewing recent advances in sparse direct solvers for large-scale, ill-conditioned, and indefinite linear systems arising in multiphysics simulations, machine learning, and data science. It organizes the review along two axes—reducing communication and latency costs in task- and data-parallel settings, and reducing computational complexity via low-rank and hierarchical matrix compression techniques—while also covering parallelization challenges and best practices for heterogeneous machines. The central observation is that direct solvers remain crucial for robustness and accuracy in scalable solver toolchains.
Significance. If the coverage is balanced and up-to-date, the chapter could serve as a useful reference for practitioners needing robust direct methods. No new theorems, algorithms, empirical results, machine-checked proofs, or reproducible code are presented; significance therefore rests entirely on the quality and representativeness of the literature synthesis rather than on any novel technical contribution.
minor comments (2)
- The abstract states that the two axes constitute 'the key recent advances' but provides no explicit selection criteria or discussion of scope limitations; a short paragraph in the introduction justifying the focus relative to other directions (e.g., hybrid direct-iterative methods) would improve transparency.
- Because the work is a survey rather than a research article, the absence of any tables summarizing complexity, communication volume, or software availability for the reviewed packages is a missed opportunity for clarity; adding such a summary table would aid readers.
Simulated Author's Rebuttal
We thank the referee for the positive review and recommendation to accept. The assessment accurately captures the manuscript's scope as a literature synthesis on communication reduction and data-sparse techniques in sparse direct solvers.
Circularity Check
No significant circularity in survey paper
full rationale
This is a survey chapter reviewing existing advances in sparse direct solvers along axes of communication reduction and low-rank compression. No new theorems, derivations, predictions, fitted parameters, or empirical results are asserted. The strongest claim (direct solvers' robustness) is a standard observation, and the weakest assumption (editorial scope of reviewed techniques) is not a falsifiable premise internal to any derivation chain. No load-bearing steps reduce to self-definition, fitted inputs, or self-citation chains.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "review recent advances of sparse direct solvers along two axes: 1) reducing communication and latency costs... 2) reducing computational complexity via low-rank and other compression techniques such as hierarchical matrix algebra"
- IndisputableMonolith/Foundation/DimensionForcing.lean: alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "3D CA algorithm framework... etree... separator tree"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.