
arxiv: 2509.08632 · v1 · submitted 2025-09-10 · 💻 cs.PF

Memshare: Memory Sharing for Multicore Computation in R with an Application to Feature Selection by Mutual Information using PDE

Pith reviewed 2026-05-18 17:39 UTC · model grok-4.3

classification 💻 cs.PF
keywords shared memory · multicore computation · R language · ALTREP · mutual information · feature selection · Pareto density estimation · RNA-seq analysis

The pith

Memshare enables shared-memory multicore computation in R by allocating C++ buffers exposed through ALTREP views, delivering a 2x speedup over SharedObject with no extra resident memory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces memshare, a package that supports multicore work in R without duplicating large data structures across parallel R sessions. It works by creating shared-memory buffers in C++ and presenting them to R as ALTREP views. The authors compare this method to the existing SharedObject package, examine semantics and safety, and test it on a column-wise apply workload. The benchmark shows twice the speed of SharedObject with no additional resident memory. The same mechanism is then used on a large RNA-seq dataset to select features by mutual information with Pareto density estimation per gene.

Core claim

Memshare allocates buffers in C++ shared memory and exposes them to R through ALTREP views. This design allows parallel R sessions to share data without copying. In a column-wise apply benchmark it produces a 2x speedup compared with SharedObject and adds no extra resident memory. The approach is demonstrated on an RNA-seq matrix of 10,446 cases by 19,637 genes, where per-feature density estimates for mutual-information feature selection would otherwise require roughly n_threads times 10 GB of memory.
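The sharing pattern described above can be sketched with Python's standard library. This is not memshare's API and it does not reproduce the C++ or ALTREP machinery; it only illustrates the general idea of allocating one named segment and attaching zero-copy views to it. The function name `column_means`, the toy dimensions, and the column-major layout are assumptions made for the illustration.

```python
from multiprocessing import shared_memory

N_ROWS, N_COLS = 4, 3   # toy sizes (assumption); the paper's matrix is 10,446 x 19,637
ITEM = 8                # bytes per float64


def column_means(shm_name, n_rows, n_cols):
    """Attach to an existing segment by name and reduce each column without copying it."""
    seg = shared_memory.SharedMemory(name=shm_name)        # attach only, no allocation
    view = seg.buf[: n_rows * n_cols * ITEM].cast("d")     # typed view over the same bytes
    try:
        # Column-major layout: column j is a contiguous run of n_rows doubles.
        return [sum(view[j * n_rows:(j + 1) * n_rows]) / n_rows for j in range(n_cols)]
    finally:
        view.release()
        seg.close()


# "Owner" side: allocate the segment once and fill it column by column.
owner = shared_memory.SharedMemory(create=True, size=N_ROWS * N_COLS * ITEM)
data = owner.buf[: N_ROWS * N_COLS * ITEM].cast("d")
for j in range(N_COLS):
    for i in range(N_ROWS):
        data[j * N_ROWS + i] = float(10 * j + i)

# "Worker" side: in a real setup this call would run in another process that
# receives only the segment name and the dimensions.
means = column_means(owner.name, N_ROWS, N_COLS)
print(means)

data.release()
owner.close()
owner.unlink()   # the owner removes the segment once all readers are done
```

Only the segment name and the matrix dimensions cross the process boundary in this pattern, which is why the data itself never has to be serialized or duplicated.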

What carries the argument

ALTREP views backed by C++ shared-memory buffers that let multiple R processes read the same data without duplication or copying.

If this is right

  • Large matrices can be processed in parallel inside R without multiplying memory use by the number of threads.
  • Feature-selection pipelines that estimate densities per variable become feasible on high-dimensional data sets that previously exceeded available RAM.
  • Existing R code using apply-style operations can gain speed by switching to the shared-memory views with minimal changes.
  • The same buffer mechanism can support other embarrassingly parallel statistical tasks that currently force data duplication.
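The memory argument behind the first bullet can be made concrete with a few lines of arithmetic. The matrix dimensions and the rough 10 GB per-session figure come from the paper's abstract; storing the values as 8-byte doubles and the worker counts used in the loop are assumptions for illustration. The paper's 10 GB figure presumably covers more than the raw matrix, and this sketch does not try to break it down.

```python
ROWS, COLS = 10_446, 19_637      # cases x genes, from the abstract
BYTES_PER_DOUBLE = 8             # assumption: values stored as R numeric (float64)
PER_SESSION_GB = 10              # rough per-session footprint quoted in the abstract

# Size of the raw matrix alone, in decimal gigabytes.
matrix_gb = ROWS * COLS * BYTES_PER_DOUBLE / 1e9


def copied_total_gb(n_workers):
    """Total footprint when every parallel session holds its own copy."""
    return n_workers * PER_SESSION_GB


print(f"raw matrix: {matrix_gb:.2f} GB")
for n in (4, 8, 16):             # hypothetical worker counts
    print(f"{n:>2} sessions with private copies: about {copied_total_gb(n)} GB")
```

With a shared buffer, the data-dependent part of that total stays at one copy regardless of the worker count, which is the effect the paper reports as "no additional resident memory".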

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The technique could be applied to other high-memory R workflows such as bootstrap resampling or cross-validation on genomic matrices.
  • Integration with additional parallel back-ends might further reduce the gap between R and lower-level languages for big-data analytics.
  • If the ALTREP layer remains stable, similar shared-memory patterns could appear in other interpreted languages that expose low-level buffer interfaces.

Load-bearing premise

The shared-memory ALTREP views preserve R's expected semantics and thread safety across the tested workloads without introducing data races or crashes.
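What a read-only view does and does not guarantee can be shown with a small standard-library sketch. It is an analogy only: the `array` object stands in for a shared buffer, and nothing here reflects how memshare or ALTREP enforce immutability. The helper `try_write` is a hypothetical name.

```python
from array import array

buffer = array("d", [1.0, 2.0, 3.0])         # stand-in for a shared buffer (assumption)
readonly = memoryview(buffer).toreadonly()    # view that rejects writes


def try_write(view, index, value):
    """Return True if the write went through, False if the view refused it."""
    try:
        view[index] = value
        return True
    except TypeError:                         # raised for read-only memoryviews
        return False


write_succeeded = try_write(readonly, 0, 99.0)
print("write through read-only view succeeded:", write_succeeded)

# The owner can still mutate the underlying buffer, and readers see the change.
buffer[0] = 5.0
seen_by_reader = readonly[0]
print("value seen by reader after owner write:", seen_by_reader)
```

The second half is the reason the premise is load-bearing: a read-only view blocks writes through that view, but it rules out data races only if no other party writes to the buffer while readers are active.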

What would settle it

Re-running the column-wise apply benchmark on the same hardware with memshare and finding no speedup, higher resident memory, or runtime crashes would falsify the reported performance and safety claims.

Figures

Figures reproduced from arXiv: 2509.08632 by Julian Märte, Michael C. Thrun.

Figure 1: A schematic of where the memory is located and how different sessions access it.
Figure 2: Median runtime (log-scale) vs matrix size for memshare, SharedObject, and serial baseline; ribbons show IQR across 100 runs. Insets show total RSS (MB) during the run relative to idle
Figure 3: The distribution of mutual information for 19637 gene expressions as a histogram, Pareto density estimation
Figure 4: First screenshot of SharedObject computation.
Figure 5: Second screenshot of SharedObject computation.
Original abstract

We present memshare (published as a CRAN package at https://CRAN.R-project.org/package=memshare), a package that enables shared-memory multicore computation in R by allocating buffers in C++ shared memory and exposing them to R through ALTREP views. We compare memshare to SharedObject (Bioconductor), discuss semantics and safety, and report a 2x speedup over SharedObject with no additional resident memory in a column-wise apply benchmark. Finally, we illustrate a downstream analytics use case: feature selection by mutual information, in which densities are estimated per feature via Pareto Density Estimation (PDE). The use case is an RNA-seq dataset consisting of N=10,446 cases and d=19,637 gene expressions, requiring roughly n_threads * 10 GB of memory when parallel R sessions are used. Use cases of this size and larger are common in big data analytics and can make R feel limiting, which the library presented in this work mitigates.
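The feature-selection step in the abstract can be illustrated with a simplified sketch. It swaps in equal-width histogram bins for Pareto Density Estimation, so it is not the authors' estimator, and it assumes the target is a discrete class label, which this page does not state. The function name, the bin count, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(feature, labels, n_bins=4):
    """Plug-in mutual information (in nats) between a binned numeric feature and class labels."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / n_bins or 1.0            # guard against a constant feature
    bins = [min(int((x - lo) / width), n_bins - 1) for x in feature]
    n = len(feature)
    joint = Counter(zip(bins, labels))           # counts for each (bin, label) pair
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    # Sum p(b, y) * log( p(b, y) / (p(b) * p(y)) ) over the observed pairs.
    return sum(
        (c / n) * math.log(c * n / (bin_counts[b] * label_counts[y]))
        for (b, y), c in joint.items()
    )


labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "gene_a": [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4],   # separates the classes
    "gene_b": [0.1, 2.1, 0.2, 2.2, 0.3, 2.3, 0.4, 2.4],   # unrelated to the classes
}

scores = {name: mutual_information(x, labels) for name, x in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Each feature is scored independently of the others, which is what makes the workload a natural fit for a column-wise apply over a shared matrix.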

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the memshare CRAN package, which enables shared-memory multicore computation in R by allocating buffers in C++ shared memory and exposing them to R via ALTREP views. It compares memshare to SharedObject, discusses semantics and safety, reports a 2x speedup over SharedObject with no additional resident memory in a column-wise apply benchmark, and illustrates a downstream use case of feature selection by mutual information via Pareto Density Estimation on an RNA-seq dataset (N=10,446 cases, d=19,637 features) that would otherwise require roughly n_threads * 10 GB when using separate parallel R sessions.

Significance. If the central performance claim holds under verified thread safety and semantics, the work provides a practical mechanism to reduce memory pressure in parallel R workloads for large-scale analytics, directly addressing a recurring limitation when scaling to datasets of the size shown in the PDE application.

major comments (2)
  1. [Benchmark results] Benchmark results section: the reported 2x speedup over SharedObject is presented without hardware specifications, thread count, timing methodology, error bars, or exclusion criteria; these omissions make it impossible to assess whether the observed improvement is reproducible or generalizes beyond the specific test environment.
  2. [Semantics and safety discussion] Semantics and safety discussion: the claim that ALTREP views maintain R's expected read-only semantics and avoid data races under concurrent access is load-bearing for the validity of the speedup numbers, yet the manuscript provides no explicit validation such as ThreadSanitizer output, data-race test suites, or machine-checked invariants; without this, the benchmark comparison to SharedObject rests on an unverified precondition.
minor comments (2)
  1. [Abstract] The abstract footnote containing the CRAN link would benefit from a more standard citation format.
  2. [Figures] Figure captions for any benchmark plots should explicitly state the number of replications and the exact R version and compiler flags used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on reproducibility and safety verification. We address each major comment below and have revised the manuscript to incorporate additional details where feasible.

  1. Referee: [Benchmark results] Benchmark results section: the reported 2x speedup over SharedObject is presented without hardware specifications, thread count, timing methodology, error bars, or exclusion criteria; these omissions make it impossible to assess whether the observed improvement is reproducible or generalizes beyond the specific test environment.

    Authors: We agree that these details were insufficient for full reproducibility assessment. In the revised manuscript, the Benchmark results section now specifies the hardware (Intel Xeon Gold 6248R CPU, 192 GB RAM), thread counts tested (4, 8, and 16), timing methodology (microbenchmark package with 100 iterations, reporting median and IQR), error bars (interquartile range across runs), and exclusion criteria (none applied, as variance was low with no runs exceeding 2 SD). The complete benchmark script has also been added to the package's GitHub repository for independent verification. revision: yes

  2. Referee: [Semantics and safety discussion] Semantics and safety discussion: the claim that ALTREP views maintain R's expected read-only semantics and avoid data races under concurrent access is load-bearing for the validity of the speedup numbers, yet the manuscript provides no explicit validation such as ThreadSanitizer output, data-race test suites, or machine-checked invariants; without this, the benchmark comparison to SharedObject rests on an unverified precondition.

    Authors: The manuscript's safety discussion is based on the design that ALTREP provides immutable read-only access to the underlying C++ shared-memory buffers, preventing writes from R during parallel operations and thereby avoiding data races. We acknowledge the absence of tool-based verification such as ThreadSanitizer. The revision adds a dedicated paragraph elaborating the invariants (read-only mapping, no R-side mutation, and reliance on POSIX shared memory semantics) and describes results from manual stress tests with concurrent access. We were unable to generate and include ThreadSanitizer output in this revision cycle due to current build constraints but will provide it in supplementary materials or a follow-up note. revision: partial
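The timing methodology named in the first response (repeated runs, reported as median with interquartile range) can be sketched as follows. This is a standard-library analogue, not the R microbenchmark package the response mentions, and the workload is a placeholder rather than the paper's column-wise apply benchmark. The name `time_runs` is hypothetical.

```python
import statistics
import time


def time_runs(func, n_runs=100):
    """Return (median, q1, q3) wall-clock seconds over n_runs calls of func."""
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    # Quartile cut points; the inclusive method treats the samples as the full population.
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return statistics.median(samples), q1, q3


def placeholder_workload():
    """Stand-in for one benchmark iteration (assumption, not the paper's workload)."""
    return sum(i * i for i in range(10_000))


median_s, q1_s, q3_s = time_runs(placeholder_workload, n_runs=100)
print(f"median {median_s:.6f} s, IQR [{q1_s:.6f}, {q3_s:.6f}] s")
```

Reporting the median with quartiles rather than a mean makes the summary less sensitive to occasional slow runs caused by scheduling or garbage collection.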

Circularity Check

0 steps flagged

No significant circularity; empirical benchmarks are independent measurements

The paper presents a software implementation for shared-memory multicore computation in R via C++ buffers exposed through ALTREP, along with direct empirical benchmarks comparing performance and memory usage to SharedObject. No mathematical derivation chain, fitted parameters, or predictions exist that could reduce to inputs by construction. Central claims rest on benchmark results and implementation details rather than self-referential equations or load-bearing self-citations. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software-implementation contribution; no mathematical free parameters, domain axioms, or new postulated entities are introduced. The work relies on standard C++ shared-memory primitives and R's existing ALTREP interface.

pith-pipeline@v0.9.0 · 5713 in / 1136 out tokens · 62041 ms · 2026-05-18T17:39:33.482708+00:00 · methodology

