archive
Every paper Pith has read.
79 papers in cs.MS · page 2
-
GPU batch SVD solver beats vendors via Jacobi method
An Efficient Batch Solver for the Singular Value Decomposition on GPUs
-
FastTwoSum works error-free for wider inputs under faithful rounding
Odd but Error-Free FastTwoSum: More General Conditions for FastTwoSum as an Error-Free Transformation for Faithful Rounding Modes
-
Agentic LLMs reach 72 percent success on finite-element code tasks
ALL-FEM: Agentic Large Language models Fine-tuned for Finite Element Methods
-
GlycoPy enables NMPC on complex bioprocess models
GlycoPy: A CasADi-based Python Framework for Hierarchical Modeling, Optimization, and Control of Bioprocesses
-
Models emulate NVIDIA Tensor Core behavior in low precision
Accurate Models of NVIDIA Tensor Cores
-
DG discretization removes splitting error from Navier-Stokes solver
A Discontinuous Galerkin Consistent Splitting Method for the Incompressible Navier-Stokes Equations
-
Graph method speeds symmetric tensor canonicalization
SeQuant Framework for Symbolic and Numerical Tensor Algebra. I. Core Capabilities
-
Cutting GPU data movement lowers both time and energy for large sparse solves
On the energy efficiency of sparse matrix computations on multi-GPU clusters
-
JAX compiles declarative RF circuits into differentiable functions
ParamRF: A JAX-native Framework for Declarative Circuit Modelling
-
GPU method encloses global minima in up to 10,000 dimensions
Global optimization tailored for graphics processing units: Complete and rigorous search for large-scale nonlinear minimization
-
C++ library converts to modules with moderate effort
Experience converting a large mathematical software package written in C++ to C++20 modules
-
Permutations cancel exactly in FFT convolution
Permutation-Avoiding FFT-Based Convolution
-
Bad scaling forces more slices in integer matrix multiply
Analysis of Floating-Point Matrix Multiplication Computed via Integer Arithmetic
-
Scaled KSVD matches SAEs on disentangling model embeddings
DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces
-
Skew-symmetric matrices factor faster via BLIS fusions
Performant Tridiagonal Factorization of Skew-Symmetric Matrices
-
New framework speeds fibrous material models 1000x on GPUs
A new open source framework for multiscale modeling of fibrous materials on heterogeneous supercomputers
-
Python package lets users build custom Bayesian optimization loops
NUBO: A Transparent Python Package for Bayesian Optimization
-
Parallel DEC library solves 3D elliptic problems with cracks
Parallelized Discrete Exterior Calculus for Three-Dimensional Elliptic Problems
-
Core array concepts drive scientific Python computing
Array Programming with NumPy
-
PyTorch shows imperative Python style can combine ease of use and speed
PyTorch: An Imperative Style, High-Performance Deep Learning Library
-
SciPy 1.0 cements its role as Python's standard scientific library
SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python
-
Hermite-like basis shrinks DG stencil to one value plus one derivative
A Hermite-like basis for faster matrix-free evaluation of interior penalty discontinuous Galerkin operators
-
Out-of-core randomized SVD factors huge matrices in small memory
Out-of-core singular value decomposition
-
R package supplies Bayesian models for psychology data
bayes4psy -- an Open Source R Package for Bayesian Statistics in Psychology
-
C++ design interpolates data in any number of dimensions
Multi-dimensional interpolations in C++
-
Sparse support choice yields 157 GFlop/s matrix-free FE ops
Algorithms and data structures for matrix-free finite element operators with MPI-parallel sparse multi-vectors
-
phcpy adds JupyterHub and GPU support for polynomial solvers
Solving Polynomial Systems with phcpy
-
Correction restores accuracy of Faddeyeva function near real axis
Remark on Algorithm 680: evaluation of the complex error function: Cause and Remedy for the Loss of Accuracy Near the Real Axis
-
Modular design separates particle models from parallel code details
A Modular and Extensible Software Architecture for Particle Dynamics
-
Devito adds OPS backend for GPU speedups
Investigating the OPS intermediate representation to target GPUs in the Devito DSL
-
Octave package implements spectral methods for differential equations
SPSMAT: GNU Octave software package for spectral and pseudospectral methods
-
Generator tailors linear algebra routines to application needs
Program Generation for Linear Algebra Using Multiple Layers of DSLs