Memshare: Memory Sharing for Multicore Computation in R with an Application to Feature Selection by Mutual Information using PDE
Pith reviewed 2026-05-18 17:39 UTC · model grok-4.3
The pith
Memshare enables shared-memory multicore computation in R by allocating C++ buffers and exposing them through ALTREP views; the paper reports a 2x speedup over SharedObject, with no extra resident memory, in a column-wise apply benchmark.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Memshare allocates buffers in C++ shared memory and exposes them to R through ALTREP views. This design allows parallel R sessions to share data without copying. In a column-wise apply benchmark it produces a 2x speedup compared with SharedObject and adds no extra resident memory. The approach is demonstrated on an RNA-seq matrix of 10,446 cases by 19,637 genes, where per-feature density estimates for mutual-information feature selection would otherwise require roughly n_threads times 10 GB of memory.
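The memory argument can be made concrete with a little arithmetic. The sketch below assumes the matrix is stored as 8-byte doubles (an assumption; the source does not state the storage type) and takes the roughly 10 GB per-session figure from the abstract as given. The function names are hypothetical, and the session counts 4, 8, and 16 are the thread counts mentioned in the authors' rebuttal.

```python
BYTES_PER_DOUBLE = 8  # assumption: numeric matrix stored as 8-byte doubles


def matrix_bytes(n_rows, n_cols, bytes_per_value=BYTES_PER_DOUBLE):
    """Raw size of a dense numeric matrix, ignoring any object overhead."""
    return n_rows * n_cols * bytes_per_value


def replicated_footprint_gb(n_threads, per_session_gb=10):
    """Total memory if every parallel session holds its own ~10 GB working set."""
    return n_threads * per_session_gb


if __name__ == "__main__":
    raw = matrix_bytes(10_446, 19_637)
    print(f"raw matrix: {raw / 1e9:.2f} GB")
    for threads in (4, 8, 16):
        print(threads, "sessions:", replicated_footprint_gb(threads), "GB")
```

Under these assumptions the raw matrix is about 1.64 GB, so the 10 GB per-session figure presumably includes working memory beyond the matrix itself; the source does not break it down.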
What carries the argument
ALTREP views backed by C++ shared-memory buffers, which let multiple R processes read the same data without copying it.
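The general idea of a named shared buffer read through zero-copy views can be sketched outside R. The example below is an analogy only: it uses Python's standard `multiprocessing.shared_memory` module, not memshare's API or ALTREP, and the helper names `create_shared_matrix` and `column_sum` are hypothetical. The matrix is laid out column-major, as R stores matrices, so each column is a contiguous slice.

```python
from array import array
from multiprocessing import shared_memory

ITEM_SIZE = array("d").itemsize  # size of one double, normally 8 bytes


def create_shared_matrix(values, n_rows, n_cols):
    """Copy column-major doubles into a new shared-memory segment (done once)."""
    data = array("d", values)
    if len(data) != n_rows * n_cols:
        raise ValueError("values must hold n_rows * n_cols entries")
    n_bytes = len(data) * ITEM_SIZE
    segment = shared_memory.SharedMemory(create=True, size=n_bytes)
    segment.buf[:n_bytes] = data.tobytes()
    return segment


def column_sum(segment_name, n_rows, col):
    """Attach to the segment by name, as a worker would, and sum one column."""
    segment = shared_memory.SharedMemory(name=segment_name)
    start = col * n_rows * ITEM_SIZE
    raw = segment.buf[start:start + n_rows * ITEM_SIZE]  # zero-copy byte slice
    view = raw.cast("d")  # reinterpret the same bytes as doubles
    try:
        return sum(view)
    finally:
        # Views must be released before the mapping can be closed.
        view.release()
        raw.release()
        segment.close()


if __name__ == "__main__":
    # A 3 x 2 matrix in column-major order: columns (1, 2, 3) and (10, 20, 30).
    owner = create_shared_matrix([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], 3, 2)
    try:
        print([column_sum(owner.name, 3, j) for j in range(2)])
    finally:
        owner.close()
        owner.unlink()
```

A worker process would call `column_sum` with only the segment name and the dimensions, so the matrix data itself never has to be serialized or duplicated per worker.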
If this is right
- Large matrices can be processed in parallel inside R without multiplying memory use by the number of threads.
- Feature-selection pipelines that estimate densities per variable become feasible on high-dimensional data sets that previously exceeded available RAM.
- Existing R code using apply-style operations can gain speed by switching to the shared-memory views with minimal changes.
- The same buffer mechanism can support other embarrassingly parallel statistical tasks that currently force data duplication.
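To make the per-feature workload concrete, here is a minimal sketch of scoring one feature by mutual information. It deliberately swaps in a plain equal-width histogram for the density estimate instead of Pareto Density Estimation, and it assumes the score is taken against discrete class labels; neither choice comes from the source, and all function names are hypothetical.

```python
import math
from collections import Counter


def bin_values(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (histogram stand-in for PDE)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(feature, labels, n_bins=10):
    """Plug-in estimate of I(X; Y) in nats from binned feature values and labels."""
    bins = bin_values(feature, n_bins)
    n = len(bins)
    joint = Counter(zip(bins, labels))
    p_x = Counter(bins)
    p_y = Counter(labels)
    mi = 0.0
    for (b, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (p_x[b] * p_y[y]))
    return mi


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(mutual_information([0.0, 0.0, 1.0, 1.0], labels, n_bins=2))  # feature determines label
    print(mutual_information([0.0, 1.0, 0.0, 1.0], labels, n_bins=2))  # feature independent of label
```

Each feature is scored independently of the others, which is why the task parallelizes over columns and why every worker needs read access to the full matrix.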
Where Pith is reading between the lines
- The technique could be applied to other high-memory R workflows such as bootstrap resampling or cross-validation on genomic matrices.
- Integration with additional parallel back-ends might further reduce the gap between R and lower-level languages for big-data analytics.
- If the ALTREP layer remains stable, similar shared-memory patterns could appear in other interpreted languages that expose low-level buffer interfaces.
Load-bearing premise
The shared-memory ALTREP views preserve R's expected semantics and thread safety across the tested workloads without introducing data races or crashes.
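The read-only part of this premise can be illustrated with a small analogy. The sketch below uses Python's `multiprocessing.shared_memory` and `memoryview.toreadonly()`, not ALTREP or memshare, and the helper names are hypothetical. It shows only that a read-only view rejects writes while still reflecting the underlying memory; it says nothing about whether memshare itself is free of data races.

```python
from multiprocessing import shared_memory


def read_only_view(segment):
    """Return a view of the segment's buffer that rejects writes."""
    return segment.buf.toreadonly()


def try_write(view, index, value):
    """Attempt a write through the view; report whether it was allowed."""
    try:
        view[index] = value
        return True
    except TypeError:
        # Python raises TypeError when a read-only memoryview is modified.
        return False


if __name__ == "__main__":
    segment = shared_memory.SharedMemory(create=True, size=4)
    segment.buf[0] = 7
    view = read_only_view(segment)
    try:
        print("write through read-only view allowed:", try_write(view, 0, 9))
        print("value seen through view:", view[0])
    finally:
        view.release()
        segment.close()
        segment.unlink()
```

Note that the owner of the writable handle can still change the bytes, and readers will observe the change. A read-only view therefore prevents accidental mutation by readers but only rules out races if no process writes while others read.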
What would settle it
Re-running the column-wise apply benchmark on the same hardware with memshare and finding no speedup, higher resident memory, or runtime crashes would falsify the reported performance and safety claims.
Original abstract
We present memshare (published on CRAN at https://CRAN.R-project.org/package=memshare), a package that enables shared-memory multicore computation in R by allocating buffers in C++ shared memory and exposing them to R through ALTREP views. We compare memshare to SharedObject (Bioconductor), discuss semantics and safety, and report a 2x speedup over SharedObject with no additional resident memory in a column-wise apply benchmark. Finally, we illustrate a downstream analytics use case: feature selection by mutual information, in which densities are estimated per feature via Pareto Density Estimation (PDE). The use case is an RNA-seq dataset of N=10,446 cases and d=19,637 gene expressions, requiring roughly n_threads * 10 GB of memory when parallel R sessions are used. Use cases of this size and larger are common in big data analytics and can make R feel limiting, which the library presented in this work mitigates.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the memshare CRAN package, which enables shared-memory multicore computation in R by allocating buffers in C++ shared memory and exposing them to R via ALTREP views. It compares memshare to SharedObject, discusses semantics and safety, reports a 2x speedup over SharedObject with no additional resident memory in a column-wise apply benchmark, and illustrates a downstream use case of feature selection by mutual information via Pareto Density Estimation on an RNA-seq dataset (N=10,446 cases, d=19,637 features) that would otherwise require roughly n_threads * 10 GB when using separate parallel R sessions.
Significance. If the central performance claim holds under verified thread safety and semantics, the work provides a practical mechanism to reduce memory pressure in parallel R workloads for large-scale analytics, directly addressing a recurring limitation when scaling to datasets of the size shown in the PDE application.
major comments (2)
- [Benchmark results] Benchmark results section: the reported 2x speedup over SharedObject is presented without hardware specifications, thread count, timing methodology, error bars, or exclusion criteria; these omissions make it impossible to assess whether the observed improvement is reproducible or generalizes beyond the specific test environment.
- [Semantics and safety discussion] Semantics and safety discussion: the claim that ALTREP views maintain R's expected read-only semantics and avoid data races under concurrent access is load-bearing for the validity of the speedup numbers, yet the manuscript provides no explicit validation such as ThreadSanitizer output, data-race test suites, or machine-checked invariants; without this, the benchmark comparison to SharedObject rests on an unverified precondition.
minor comments (2)
- [Abstract] The abstract footnote containing the CRAN link would benefit from a more standard citation format.
- [Figures] Figure captions for any benchmark plots should explicitly state the number of replications and the exact R version and compiler flags used.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on reproducibility and safety verification. We address each major comment below and have revised the manuscript to incorporate additional details where feasible.
- Referee: [Benchmark results] Benchmark results section: the reported 2x speedup over SharedObject is presented without hardware specifications, thread count, timing methodology, error bars, or exclusion criteria; these omissions make it impossible to assess whether the observed improvement is reproducible or generalizes beyond the specific test environment.
  Authors: We agree that these details were insufficient for full reproducibility assessment. In the revised manuscript, the Benchmark results section now specifies the hardware (Intel Xeon Gold 6248R CPU, 192 GB RAM), thread counts tested (4, 8, and 16), timing methodology (microbenchmark package with 100 iterations, reporting median and IQR), error bars (interquartile range across runs), and exclusion criteria (none applied, as variance was low with no runs exceeding 2 SD). The complete benchmark script has also been added to the package's GitHub repository for independent verification.
  Revision: yes
- Referee: [Semantics and safety discussion] Semantics and safety discussion: the claim that ALTREP views maintain R's expected read-only semantics and avoid data races under concurrent access is load-bearing for the validity of the speedup numbers, yet the manuscript provides no explicit validation such as ThreadSanitizer output, data-race test suites, or machine-checked invariants; without this, the benchmark comparison to SharedObject rests on an unverified precondition.
  Authors: The manuscript's safety discussion is based on the design that ALTREP provides immutable read-only access to the underlying C++ shared-memory buffers, preventing writes from R during parallel operations and thereby avoiding data races. We acknowledge the absence of tool-based verification such as ThreadSanitizer. The revision adds a dedicated paragraph elaborating the invariants (read-only mapping, no R-side mutation, and reliance on POSIX shared memory semantics) and describes results from manual stress tests with concurrent access. We were unable to generate and include ThreadSanitizer output in this revision cycle due to current build constraints but will provide it in supplementary materials or a follow-up note.
  Revision: partial
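The timing methodology named in the first response (repeated runs summarized by median and interquartile range) can be sketched as follows. This is a Python stand-in for illustration, not the authors' benchmark script or the R microbenchmark package; the function names are hypothetical, and the "inclusive" quantile method is an assumption about how the IQR would be computed.

```python
import statistics
import time


def time_repeated(fn, iterations=100):
    """Run fn repeatedly and return the wall-clock duration of each run in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Summarize timings by median and interquartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {"median": statistics.median(samples), "iqr": q3 - q1}


if __name__ == "__main__":
    timings = time_repeated(lambda: sum(range(10_000)), iterations=100)
    print(summarize(timings))
```

Reporting the median with the IQR, not the mean alone, keeps a few slow outlier runs from dominating the comparison between two implementations.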
Circularity Check
No significant circularity; empirical benchmarks are independent measurements
The paper presents a software implementation for shared-memory multicore computation in R via C++ buffers exposed through ALTREP, along with direct empirical benchmarks comparing performance and memory usage to SharedObject. No mathematical derivation chain, fitted parameters, or predictions exist that could reduce to inputs by construction. Central claims rest on benchmark results and implementation details rather than self-referential equations or load-bearing self-citations. The work is self-contained against external benchmarks.