State-of-The-Art Sparse Direct Solvers

Kiran Gullapalli; Matthias Bollh\"ofer; Olaf Schenk; Radim Janal\'ik; Steve Hamm

arxiv: 1907.05309 · v1 · pith:LUQZJTIInew · submitted 2019-07-11 · 💻 cs.DS

State-of-The-Art Sparse Direct Solvers

Matthias Bollh\"ofer , Olaf Schenk , Radim Janal\'ik , Steve Hamm , Kiran Gullapalli This is my paper

Pith reviewed 2026-05-24 22:53 UTC · model grok-4.3

classification 💻 cs.DS

keywords sparse direct solverspreprocessingfill-in reductioncircuit simulationBLAS kernelsparallel factorizationcombinatorial algorithms

0 comments

The pith

Preprocessing with combinatorial algorithms and dense BLAS kernels has significantly advanced sparse direct solvers for circuit simulation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews modern sparse elimination methods that rely on a preprocessing phase using combinatorial algorithms. These steps improve diagonal dominance, reduce fill-in during factorization, and increase opportunities for parallel execution. The methods also identify dense submatrices that can then be processed efficiently with multithreaded level-3 BLAS routines. The authors illustrate these advances through examples drawn from circuit simulation problems and conclude that the combination of techniques has produced substantial performance gains in recent years.

Core claim

Modern sparse direct solvers rely on a preprocessing phase based on combinatorial algorithms which improve diagonal dominance, reduce fill-in, and improve concurrency to allow for parallel treatment, while detecting dense submatrices that are handled by dense matrix kernels based on multithreaded level-3 BLAS; these combined improvements have advanced direct solution methods significantly for problems arising from circuit simulation.

What carries the argument

The preprocessing phase based on combinatorial algorithms that improves diagonal dominance, reduces fill-in, and enables parallelism, together with the detection and handling of dense submatrices via multithreaded level-3 BLAS kernels.

If this is right

Larger circuit simulation problems become solvable within practical time and memory limits.
Parallel execution on multi-core processors improves scalability of the factorization phase.
Hybrid dense-sparse workflows become standard for handling matrices with localized dense blocks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same preprocessing ideas could transfer to other application areas that generate large sparse matrices, such as finite-element structural analysis.
Hardware-specific tuning of the BLAS kernels might yield further gains on accelerators or GPUs.
Integration with iterative refinement or mixed-precision arithmetic could extend the approach to even larger systems.

Load-bearing premise

The chosen preprocessing algorithms and dense kernel strategies represent the current state of the art and have delivered significant advances in practice.

What would settle it

Direct timing comparisons on the same circuit simulation matrices showing that the reviewed preprocessing and kernel combinations produce no measurable reduction in solve time or memory use compared with earlier direct solvers.

read the original abstract

In this chapter we will give an insight into modern sparse elimination methods. These are driven by a preprocessing phase based on combinatorial algorithms which improve diagonal dominance, reduce fill-in, and improve concurrency to allow for parallel treatment. Moreover, these methods detect dense submatrices which can be handled by dense matrix kernels based on multithreaded level-3 BLAS. We will demonstrate for problems arising from circuit simulation, how the improvements in recent years have advanced direct solution methods significantly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Review chapter summarizing preprocessing and BLAS use in sparse direct solvers, with no new results or controlled benchmarks.

read the letter

This is a review chapter on modern sparse direct solvers. It covers preprocessing steps that target diagonal dominance, fill-in reduction, and concurrency for parallel work, then switches dense submatrices to multithreaded level-3 BLAS kernels, using circuit simulation matrices as the running example to show recent progress. Nothing here is new; the text is explicitly framed as an overview of existing methods with no fresh algorithms, derivations, or experiments. What it does well is give a compact, practical description of the current pipeline. The explanation of how combinatorial preprocessing feeds into parallel execution and dense kernels is straightforward and matches what practitioners actually use. The soft spot is the central claim that improvements have advanced direct methods significantly. That rests on the authors' choice of which algorithms to highlight and on cited performance numbers from prior work, not on any new side-by-side runs that control for ordering, hardware, or older baselines. Without fresh controlled data the size of the advance stays hard to verify independently. This is for readers who need a quick orientation to the state of sparse direct solvers, especially those working on circuit problems or similar large sparse systems. Someone already active in the area will find little they do not already know from the cited papers. It is the sort of curated summary that belongs in a handbook, so it deserves a serious referee to check the balance and accuracy of the selection. I would send it to review rather than desk reject.

Referee Report

1 major / 0 minor

Summary. The manuscript is a review chapter describing modern sparse direct solvers. It focuses on a preprocessing phase using combinatorial algorithms to improve diagonal dominance, reduce fill-in, and enhance concurrency for parallelism, along with detection of dense submatrices handled via multithreaded level-3 BLAS kernels. The authors state they will demonstrate significant recent advances in direct methods for circuit-simulation problems.

Significance. If the review provides an accurate synthesis of recent preprocessing and kernel techniques with clear links to performance gains on circuit matrices, it could serve as a useful reference for the sparse linear algebra community. However, as a descriptive survey without new experiments or parameter-free derivations, its significance is limited to curation and exposition rather than advancing the state of the art.

major comments (1)

[Abstract] Abstract: the central claim that 'the improvements in recent years have advanced direct solution methods significantly' for circuit-simulation problems is supported only by the authors' selection of preprocessing algorithms and cited performance numbers; no new controlled benchmarks (controlling for ordering, hardware, or baseline solvers) are described, leaving the magnitude of the advance unverifiable from the manuscript.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on our review chapter. We address the single major comment below.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that 'the improvements in recent years have advanced direct solution methods significantly' for circuit-simulation problems is supported only by the authors' selection of preprocessing algorithms and cited performance numbers; no new controlled benchmarks (controlling for ordering, hardware, or baseline solvers) are described, leaving the magnitude of the advance unverifiable from the manuscript.

Authors: The manuscript is explicitly a review chapter whose purpose is to synthesize and curate recent literature on preprocessing techniques for sparse direct solvers, with emphasis on circuit-simulation matrices. The abstract claim is therefore grounded in the performance results reported in the cited primary sources rather than in any new experiments conducted by the authors. Because the work is not a research contribution presenting original benchmarks, we did not include new controlled experiments. We can, however, revise the abstract to make the review nature of the claim more explicit and to add a sentence directing readers to the specific cited papers that contain the original benchmark data. revision: partial

Circularity Check

0 steps flagged

No circularity; descriptive review without derivations, predictions, or self-referential reductions

full rationale

The paper is a review chapter describing existing preprocessing algorithms (diagonal dominance, fill-in reduction, concurrency) and dense BLAS kernels for sparse direct solvers, with a focus on circuit simulation matrices. No equations, fitted parameters, uniqueness theorems, or new predictions appear in the provided text or abstract. The claim of 'significant advances' rests on curation of prior methods rather than any derivation that reduces to the paper's own inputs by construction. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results are present. This is the expected outcome for a survey paper whose content is self-contained as description.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper; no new free parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5608 in / 877 out tokens · 17665 ms · 2026-05-24T22:53:41.563995+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages

[1]

A. Aho, J. Hopcroft, and J. Ullman.Data structures and algorithms. Addison-Wesley, 1983

work page 1983
[2]

Anapproximateminimumdegreeorderingalgorithm

P.Amestoy,T.A.Davis,andI.S.Duﬀ. Anapproximateminimumdegreeorderingalgorithm. SIAM J. Matrix Anal. Appl., 17(4):886–905, 1996

work page 1996
[3]

McKenney, S

E.Anderson,Z.Bai,C.Bischof,J.Demmel,J.Dongarra,J.D.Croz,A.Greenbaum,S.Ham- marling, A. McKenney, S. Ostrouchov, and D. Sorensen.LAPACK Users’ Guide, Second Edition. SIAM Publications, 1995

work page 1995
[4]

M.Benzi,J.Haws,andM.Tuma.Preconditioninghighlyindeﬁniteandnonsymmetricmatrices. SIAM J. Sci. Comput., 22(4):1333–1353, 2000. 28 M. Bollhöfer, O. Schenk, R. Janalík, Steve Hamm, and K. Gullapalli

work page 2000
[5]

C. Berge. Two theorems in graph theory. InProceedings of National Academy of Science, pages 842–844, USA, 1957

work page 1957
[6]

J. D. Booth, N. D. Ellingwoodb, and S. R. Heidi K. Thornquist. Basker: Parallel sparse lu factorizationutilizinghierarchicalparallelismanddatalayouts. ParallelComputing,68:17–31, 2017

work page 2017
[7]

Gpu-acceleratedsparselufactorizationforcircuitsim- ulation with performance modeling.IEEE TrAnsactions on Parallel and Distributed Systems, 26:786–795, 2015

X.Chen,L.Ren,Y.Wang,andH.Yang. Gpu-acceleratedsparselufactorizationforcircuitsim- ulation with performance modeling.IEEE TrAnsactions on Parallel and Distributed Systems, 26:786–795, 2015

work page 2015
[8]

X. Chen, Y. Wang, and H. Yang. Nicslu: An adaptive sparse matrix solver for parallel circuit simulation. IEEETransactionsonComputer-AidedDesignofIntegratedCircuitsandSystems , 32:261–274, 2013

work page 2013
[9]

Chevalier and F

C. Chevalier and F. Pellegrini. PT-SCOTCH: a tool for eﬃcient parallel graph ordering. Parallel Comput., 34(6–8):318–331, 2008

work page 2008
[10]

ACM, 1969

E.CuthillandJ.McKee.Reducingthebandwidthofsparsesymmetricmatrices.In Proceedings of the 24th national conference of the ACM. ACM, 1969

work page 1969
[11]

Davis and K

T. Davis and K. Stanley. Sparse LU factorization of circuit simulation matrices. InNumerical Aspects of Circuit and Device Modeling Workshop, June 2004

work page 2004
[12]

T. A. Davis.Direct Methods for Sparse Linear Systems. SIAM Publications, 2006

work page 2006
[13]

ACM SIGNUM Newslett., 20:2–18, 1985

D.DodsonandJ.G.Lewis.Issuesrelatingtoextensionofthebasiclinearalgebrasubprograms. ACM SIGNUM Newslett., 20:2–18, 1985

work page 1985
[14]

J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson. Issues relating to extension of the basic linear algebra subprograms.ACM SIGNUM Newslett., 20:2–18, 1985

work page 1985
[15]

I. S. Duﬀ, A. Erisman, and J. Reid.Direct Methods for Sparse Matrices. Oxford University Press, 1986

work page 1986
[16]

I. S. Duﬀ and J. Koster. The design and use of algorithms for permuting large entries to the diagonal of sparse matrices.SIAM J. Matrix Anal. Appl., 20:889–901, 1999

work page 1999
[17]

I. S. Duﬀ and S. Pralet. Strategies for scaling and pivoting for sparse symmetric indeﬁnite problems. Technical Report TR/PA/04/59, CERFACS, Toulouse, France, 2004

work page 2004
[18]

Alinear-timeheuristicforimprovingnetworkpartitions

C.M.FiducciaandR.M.Mattheyses. Alinear-timeheuristicforimprovingnetworkpartitions. InProceedings of the 19th Design Automation Conference, pages 175âĂ“–181. IEEE, 1997

work page 1997
[19]

M. Fiedler. A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory.Czecheslovak Mathematical Journal, 25(100):619–633, 1975

work page 1975
[20]

George and J

A. George and J. W. H. Liu. The evolution of the minimum degree ordering algorithm.SIAM Review, 31:1–19, 1989

work page 1989
[21]

J. A. George and J. W. Liu.Computer Solution of Large Sparse Positive Deﬁnite Systems. Prentice-Hall, Englewood Cliﬀs, NJ, USA, 1981

work page 1981
[22]

J. A. George and E. Ng. An implementation of Gaussian elimination with partial pivoting for sparse systems.SIAM J. Sci. Statist. Comput., 6(2):390–409, 1985

work page 1985
[23]

Predictingstructureinsparsematrixcomputations

J.R.Gilbert. Predictingstructureinsparsematrixcomputations. SIAMJ.MatrixAnal.Appl. , 15(1):162–79, 1994

work page 1994
[24]

Gupta and L

A. Gupta and L. Ying. On algorithms for ﬁnding maximum matchings in bipartite graphs. Technical Report RC 21576 (97320), IBM T. J. Watson Research Center, Yorktown Heights, NY, October 25, 1999

work page 1999
[25]

Karypis and V

G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientiﬁc Computing, 20(1):359–392, 1998

work page 1998
[26]

Karypis, K

G. Karypis, K. Schloegel, and V. Kumar.ParMeTis: Parallel Graph Partitioning and Sparse Matrix Ordering Library, Version 2.0. University of Minnesota, Dept. of Computer Science, September 1999

work page 1999
[27]

Kernighan and S

B. Kernighan and S. Lin. An eﬃcient heuristic procedure for partitioning graphs.The Bell System Technical Journal, 29(2):291–307, 1970

work page 1970
[28]

OnparallelpushâĂ“relabelbasedalgorithmsforbipartite maximum matching.Parallel Comput., 40(7):289–308, 2014

J.Langguth,A.Azad,andF.Manne. OnparallelpushâĂ“relabelbasedalgorithmsforbipartite maximum matching.Parallel Comput., 40(7):289–308, 2014

work page 2014
[29]

Langguth, M

J. Langguth, M. M. A. Patwary, and F. Manne. Parallel algorithms for bipartite matching problems on distributed memory computers.Parallel Comput., 37(12):820–845, 2011. State-of-The-Art Sparse Direct Solvers 29

work page 2011
[30]

LaSalle and G

D. LaSalle and G. Karypis. Multi–threaded graph partitioning. Technical report, Department of Computer Science & Engineering, University of Minnesota, Minneapolis, 2013

work page 2013
[31]

W.-K. Lee, R. Achar, and M. S. Nakhla. Dynamic gpu parallel sparse lu factorization for fast circuit simulation.IEEE Transactions on Very Large Scale Integration (VLSI) Systems, pages 1–12, 2018

work page 2018
[32]

X. S. Li. An overview of SuperLU: Algorithms, implementation, and user interface.ACM Trans. Math. Softw., 31(3):302–325, 2005

work page 2005
[33]

J. W. Liu. The role of elimination trees in sparse factorization.SIAM J. Matrix Anal. Appl., 11(1):134–172, 1990

work page 1990
[34]

J. W. Liu and A. Sherman. Comparative analysis of the Cuthill–McKee and the reverse Cuthill–McKee ordering algorithms for sparse matrices.SIAM J. Numer. Anal., 13:198–213, 1976

work page 1976
[35]

H. M. Markowitz. The elimination form of the inverse and its application to linear program- ming. Management Science, 3:255–269, April 1957

work page 1957
[36]

SPICE2:acomputerprogramtosimulatesemiconductorcircuits

L.W.Nagel. SPICE2:acomputerprogramtosimulatesemiconductorcircuits. Memorandum No. ERL-M520, University of California, Berkeley, California, May 1975

work page 1975
[37]

Olschowka and A

M. Olschowka and A. Neumaier. A new pivoting strategy for Gaussian elimination.Linear Algebra and its Applications, 240:131–151, 1996

work page 1996
[38]

Pellegrini, J

F. Pellegrini, J. Roman, and P. Amestoy. Hybridizing nested dissection and halo approximate minimum degree for eﬃcient sparse matrix ordering.Concurrency: Practice and Experience, 12:69–84, 2000

work page 2000
[39]

Rewienski

M. Rewienski. A perspective on fast-spice simulation technology. Springer, Dordrecht, 2011

work page 2011
[40]

Agraph-theoreticstudyofthenumericalsolutionofsparsepositivedeﬁnitesystems of linear equations

D.J.Rose. Agraph-theoreticstudyofthenumericalsolutionofsparsepositivedeﬁnitesystems of linear equations. In R. C. Read, editor,Graph Theory and Computing. Academic Press, 1972

work page 1972
[41]

O. Schenk. Scalable parallel sparse LU factorization methods on shared memory multipro- cessors. PhD thesis, ETH Zurich, 2000. Diss. Technische Wissenschaften ETH Zurich, Nr. 13515, 2000

work page 2000
[42]

Schenk and K

O. Schenk and K. Gärtner. Solving unsymmetric sparse systems of linear equations with PARDISO. Journal of Future Generation Computer Systems, 20(3):475–487, 2004

work page 2004
[43]

R. E. Tarjan. Data structures and network algorithms. InCBMS–NSF Regional Conference Series in Applied Mathematics, volume 44, 1983

work page 1983
[44]

X.Zhao,L.Han,andZ.Feng. Aperformance-guidedgraphsparsiﬁcationapproachtoscalable and robust spice-accurate integrated circuit simulations.IEEE Transactions on Computer- Aided Design of Integrated Circuits and Systems, 34:1639–1651, 2015

work page 2015

[1] [1]

A. Aho, J. Hopcroft, and J. Ullman.Data structures and algorithms. Addison-Wesley, 1983

work page 1983

[2] [2]

Anapproximateminimumdegreeorderingalgorithm

P.Amestoy,T.A.Davis,andI.S.Duﬀ. Anapproximateminimumdegreeorderingalgorithm. SIAM J. Matrix Anal. Appl., 17(4):886–905, 1996

work page 1996

[3] [3]

McKenney, S

E.Anderson,Z.Bai,C.Bischof,J.Demmel,J.Dongarra,J.D.Croz,A.Greenbaum,S.Ham- marling, A. McKenney, S. Ostrouchov, and D. Sorensen.LAPACK Users’ Guide, Second Edition. SIAM Publications, 1995

work page 1995

[4] [4]

M.Benzi,J.Haws,andM.Tuma.Preconditioninghighlyindeﬁniteandnonsymmetricmatrices. SIAM J. Sci. Comput., 22(4):1333–1353, 2000. 28 M. Bollhöfer, O. Schenk, R. Janalík, Steve Hamm, and K. Gullapalli

work page 2000

[5] [5]

C. Berge. Two theorems in graph theory. InProceedings of National Academy of Science, pages 842–844, USA, 1957

work page 1957

[6] [6]

J. D. Booth, N. D. Ellingwoodb, and S. R. Heidi K. Thornquist. Basker: Parallel sparse lu factorizationutilizinghierarchicalparallelismanddatalayouts. ParallelComputing,68:17–31, 2017

work page 2017

[7] [7]

Gpu-acceleratedsparselufactorizationforcircuitsim- ulation with performance modeling.IEEE TrAnsactions on Parallel and Distributed Systems, 26:786–795, 2015

X.Chen,L.Ren,Y.Wang,andH.Yang. Gpu-acceleratedsparselufactorizationforcircuitsim- ulation with performance modeling.IEEE TrAnsactions on Parallel and Distributed Systems, 26:786–795, 2015

work page 2015

[8] [8]

X. Chen, Y. Wang, and H. Yang. Nicslu: An adaptive sparse matrix solver for parallel circuit simulation. IEEETransactionsonComputer-AidedDesignofIntegratedCircuitsandSystems , 32:261–274, 2013

work page 2013

[9] [9]

Chevalier and F

C. Chevalier and F. Pellegrini. PT-SCOTCH: a tool for eﬃcient parallel graph ordering. Parallel Comput., 34(6–8):318–331, 2008

work page 2008

[10] [10]

ACM, 1969

E.CuthillandJ.McKee.Reducingthebandwidthofsparsesymmetricmatrices.In Proceedings of the 24th national conference of the ACM. ACM, 1969

work page 1969

[11] [11]

Davis and K

T. Davis and K. Stanley. Sparse LU factorization of circuit simulation matrices. InNumerical Aspects of Circuit and Device Modeling Workshop, June 2004

work page 2004

[12] [12]

T. A. Davis.Direct Methods for Sparse Linear Systems. SIAM Publications, 2006

work page 2006

[13] [13]

ACM SIGNUM Newslett., 20:2–18, 1985

D.DodsonandJ.G.Lewis.Issuesrelatingtoextensionofthebasiclinearalgebrasubprograms. ACM SIGNUM Newslett., 20:2–18, 1985

work page 1985

[14] [14]

J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson. Issues relating to extension of the basic linear algebra subprograms.ACM SIGNUM Newslett., 20:2–18, 1985

work page 1985

[15] [15]

I. S. Duﬀ, A. Erisman, and J. Reid.Direct Methods for Sparse Matrices. Oxford University Press, 1986

work page 1986

[16] [16]

I. S. Duﬀ and J. Koster. The design and use of algorithms for permuting large entries to the diagonal of sparse matrices.SIAM J. Matrix Anal. Appl., 20:889–901, 1999

work page 1999

[17] [17]

I. S. Duﬀ and S. Pralet. Strategies for scaling and pivoting for sparse symmetric indeﬁnite problems. Technical Report TR/PA/04/59, CERFACS, Toulouse, France, 2004

work page 2004

[18] [18]

Alinear-timeheuristicforimprovingnetworkpartitions

C.M.FiducciaandR.M.Mattheyses. Alinear-timeheuristicforimprovingnetworkpartitions. InProceedings of the 19th Design Automation Conference, pages 175âĂ“–181. IEEE, 1997

work page 1997

[19] [19]

M. Fiedler. A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory.Czecheslovak Mathematical Journal, 25(100):619–633, 1975

work page 1975

[20] [20]

George and J

A. George and J. W. H. Liu. The evolution of the minimum degree ordering algorithm.SIAM Review, 31:1–19, 1989

work page 1989

[21] [21]

J. A. George and J. W. Liu.Computer Solution of Large Sparse Positive Deﬁnite Systems. Prentice-Hall, Englewood Cliﬀs, NJ, USA, 1981

work page 1981

[22] [22]

J. A. George and E. Ng. An implementation of Gaussian elimination with partial pivoting for sparse systems.SIAM J. Sci. Statist. Comput., 6(2):390–409, 1985

work page 1985

[23] [23]

Predictingstructureinsparsematrixcomputations

J.R.Gilbert. Predictingstructureinsparsematrixcomputations. SIAMJ.MatrixAnal.Appl. , 15(1):162–79, 1994

work page 1994

[24] [24]

Gupta and L

A. Gupta and L. Ying. On algorithms for ﬁnding maximum matchings in bipartite graphs. Technical Report RC 21576 (97320), IBM T. J. Watson Research Center, Yorktown Heights, NY, October 25, 1999

work page 1999

[25] [25]

Karypis and V

G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientiﬁc Computing, 20(1):359–392, 1998

work page 1998

[26] [26]

Karypis, K

G. Karypis, K. Schloegel, and V. Kumar.ParMeTis: Parallel Graph Partitioning and Sparse Matrix Ordering Library, Version 2.0. University of Minnesota, Dept. of Computer Science, September 1999

work page 1999

[27] [27]

Kernighan and S

B. Kernighan and S. Lin. An eﬃcient heuristic procedure for partitioning graphs.The Bell System Technical Journal, 29(2):291–307, 1970

work page 1970

[28] [28]

OnparallelpushâĂ“relabelbasedalgorithmsforbipartite maximum matching.Parallel Comput., 40(7):289–308, 2014

J.Langguth,A.Azad,andF.Manne. OnparallelpushâĂ“relabelbasedalgorithmsforbipartite maximum matching.Parallel Comput., 40(7):289–308, 2014

work page 2014

[29] [29]

Langguth, M

J. Langguth, M. M. A. Patwary, and F. Manne. Parallel algorithms for bipartite matching problems on distributed memory computers.Parallel Comput., 37(12):820–845, 2011. State-of-The-Art Sparse Direct Solvers 29

work page 2011

[30] [30]

LaSalle and G

D. LaSalle and G. Karypis. Multi–threaded graph partitioning. Technical report, Department of Computer Science & Engineering, University of Minnesota, Minneapolis, 2013

work page 2013

[31] [31]

W.-K. Lee, R. Achar, and M. S. Nakhla. Dynamic gpu parallel sparse lu factorization for fast circuit simulation.IEEE Transactions on Very Large Scale Integration (VLSI) Systems, pages 1–12, 2018

work page 2018

[32] [32]

X. S. Li. An overview of SuperLU: Algorithms, implementation, and user interface.ACM Trans. Math. Softw., 31(3):302–325, 2005

work page 2005

[33] [33]

J. W. Liu. The role of elimination trees in sparse factorization.SIAM J. Matrix Anal. Appl., 11(1):134–172, 1990

work page 1990

[34] [34]

J. W. Liu and A. Sherman. Comparative analysis of the Cuthill–McKee and the reverse Cuthill–McKee ordering algorithms for sparse matrices.SIAM J. Numer. Anal., 13:198–213, 1976

work page 1976

[35] [35]

H. M. Markowitz. The elimination form of the inverse and its application to linear program- ming. Management Science, 3:255–269, April 1957

work page 1957

[36] [36]

SPICE2:acomputerprogramtosimulatesemiconductorcircuits

L.W.Nagel. SPICE2:acomputerprogramtosimulatesemiconductorcircuits. Memorandum No. ERL-M520, University of California, Berkeley, California, May 1975

work page 1975

[37] [37]

Olschowka and A

M. Olschowka and A. Neumaier. A new pivoting strategy for Gaussian elimination.Linear Algebra and its Applications, 240:131–151, 1996

work page 1996

[38] [38]

Pellegrini, J

F. Pellegrini, J. Roman, and P. Amestoy. Hybridizing nested dissection and halo approximate minimum degree for eﬃcient sparse matrix ordering.Concurrency: Practice and Experience, 12:69–84, 2000

work page 2000

[39] [39]

Rewienski

M. Rewienski. A perspective on fast-spice simulation technology. Springer, Dordrecht, 2011

work page 2011

[40] [40]

Agraph-theoreticstudyofthenumericalsolutionofsparsepositivedeﬁnitesystems of linear equations

D.J.Rose. Agraph-theoreticstudyofthenumericalsolutionofsparsepositivedeﬁnitesystems of linear equations. In R. C. Read, editor,Graph Theory and Computing. Academic Press, 1972

work page 1972

[41] [41]

O. Schenk. Scalable parallel sparse LU factorization methods on shared memory multipro- cessors. PhD thesis, ETH Zurich, 2000. Diss. Technische Wissenschaften ETH Zurich, Nr. 13515, 2000

work page 2000

[42] [42]

Schenk and K

O. Schenk and K. Gärtner. Solving unsymmetric sparse systems of linear equations with PARDISO. Journal of Future Generation Computer Systems, 20(3):475–487, 2004

work page 2004

[43] [43]

R. E. Tarjan. Data structures and network algorithms. InCBMS–NSF Regional Conference Series in Applied Mathematics, volume 44, 1983

work page 1983

[44] [44]

X.Zhao,L.Han,andZ.Feng. Aperformance-guidedgraphsparsiﬁcationapproachtoscalable and robust spice-accurate integrated circuit simulations.IEEE Transactions on Computer- Aided Design of Integrated Circuits and Systems, 34:1639–1651, 2015

work page 2015