Processing-in-memory for genomics workloads
Pith reviewed 2026-05-19 11:58 UTC · model grok-4.3
The pith
Co-designing genomics algorithms with processing-in-memory hardware can cut energy use, costs, and time for DNA sequencing analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that co-designing algorithms and data structures commonly used in genomics with several PIM architectures will achieve the highest cost, energy, and time savings for processing high-throughput DNA and RNA sequencing data, enabling analysis without energy-hungry computer clusters or cloud platforms.
What carries the argument
Co-design of existing genomics algorithms and data structures with PIM architectures, allowing computation to occur at the memory location to reduce data movement.
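The data-movement argument can be made concrete with a toy accounting model. The sketch below is an illustration only, not the BioPIM design: the bank layout, the one-byte-per-base encoding, the 8-byte result size, and the function names `host_centric` and `in_memory` are all assumptions introduced here. It counts occurrences of a k-mer in a reference split across memory banks, once by moving every bank's contents to the host and once by sending the query to each bank and moving back only a count.

```python
def count_kmer(segment: str, kmer: str) -> int:
    """Count (possibly overlapping) occurrences of kmer in segment."""
    k = len(kmer)
    return sum(1 for i in range(len(segment) - k + 1) if segment[i:i + k] == kmer)


def host_centric(banks: list[str], kmer: str) -> tuple[int, int]:
    """Host pulls every bank's data over the bus, then counts.

    Returns (hits, bytes_moved). Assumes 1 byte per base (toy encoding).
    """
    bytes_moved = sum(len(bank) for bank in banks)
    hits = sum(count_kmer(bank, kmer) for bank in banks)
    return hits, bytes_moved


def in_memory(banks: list[str], kmer: str, count_bytes: int = 8) -> tuple[int, int]:
    """Each bank counts locally; only the query and one count per bank cross the bus.

    Returns (hits, bytes_moved). count_bytes is an assumed result size.
    """
    bytes_moved = len(banks) * (len(kmer) + count_bytes)
    hits = sum(count_kmer(bank, kmer) for bank in banks)
    return hits, bytes_moved


# Hypothetical two-bank reference; k-mers spanning bank boundaries are ignored
# in both variants to keep the model small.
banks = ["ACGTACGT" * 4, "TTACGTTT" * 4]
print(host_centric(banks, "ACGT"))
print(in_memory(banks, "ACGT"))
```

Both variants return the same hit count; only the bytes crossing the bus differ. In this model the host-centric cost grows with the size of the reference, while the in-memory cost grows with the number of banks, which is the intuition behind the claim. Real hardware adds costs that the model ignores, such as in-bank compute time and result aggregation.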
Load-bearing premise
That co-designing existing genomics algorithms and data structures with PIM hardware will produce substantially higher energy, cost, and time savings than conventional cluster-based processing.
What would settle it
A side-by-side measurement on a representative genomics task such as read mapping or variant calling showing that the PIM-co-designed version requires more total energy or more wall-clock time than the same task on a standard cluster.
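The settling test can be written down as a small check. This is a minimal sketch under stated assumptions: the `Run` record, its field names, and `claim_refuted` are hypothetical, and the numbers are placeholders rather than measurements. Energy is derived as average power multiplied by wall-clock time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measured run of a genomics task (hypothetical record layout)."""
    label: str
    wall_clock_s: float  # end-to-end wall-clock time in seconds
    avg_power_w: float   # average power draw in watts over the run

    @property
    def energy_j(self) -> float:
        # Energy (joules) = average power (watts) * time (seconds).
        return self.avg_power_w * self.wall_clock_s


def claim_refuted(pim: Run, cluster: Run) -> bool:
    """True if the PIM run needs more total energy OR more wall-clock time."""
    return pim.energy_j > cluster.energy_j or pim.wall_clock_s > cluster.wall_clock_s


# Placeholder values for illustration only; they are not measurements.
pim = Run("pim-read-mapping", wall_clock_s=100.0, avg_power_w=50.0)
cluster = Run("cluster-read-mapping", wall_clock_s=80.0, avg_power_w=400.0)
print(pim.energy_j, cluster.energy_j, claim_refuted(pim, cluster))
```

The check uses a logical OR because the criterion above counts either outcome, higher energy or longer wall-clock time, as evidence against the claim. A fair comparison would also need both runs to cover the same task, the same input, and the same system boundary for the power measurement.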
Low-cost, high-throughput DNA and RNA sequencing (HTS) data is the backbone of the life sciences. Genome sequencing is now becoming a part of Predictive, Preventive, Personalized, and Participatory (termed 'P4') medicine. All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer, consuming substantial energy, and wasting valuable time. Therefore, there is a need for fast, energy-efficient, and cost-efficient technologies that enable genomics research without requiring data centers and cloud platforms. We recently launched the BioPIM Project to leverage emerging processing-in-memory (PIM) technologies to enable energy- and cost-efficient analysis of bioinformatics workloads. The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript announces the recent launch of the BioPIM Project, which aims to leverage processing-in-memory (PIM) technologies by co-designing commonly used genomics algorithms and data structures with several PIM architectures. The goal is to enable energy-efficient, cost-efficient, and fast analysis of high-throughput sequencing (HTS) data for bioinformatics workloads, reducing reliance on energy-hungry computer clusters and data centers in support of P4 medicine.
Significance. If the envisioned co-design efforts prove successful, the work could have substantial significance for sustainable genomics by lowering energy consumption and costs associated with the growing volumes of HTS data. However, the manuscript contains no results, benchmarks, or specific technical details, so its contribution is limited to a forward-looking project description rather than demonstrated advances.
major comments (1)
- The manuscript supplies no data, benchmarks, derivations, error analysis, or preliminary results to support the feasibility of achieving the highest cost, energy, and time savings via co-design (abstract). This absence is load-bearing because the central claim is framed as a project goal whose value depends on eventual outcomes that cannot be evaluated from the given text.
Simulated Author's Rebuttal
We thank the referee for reviewing our manuscript on the BioPIM Project. We address the major comment below and clarify the intended scope of this work as a project announcement.
- Referee: The manuscript supplies no data, benchmarks, derivations, error analysis, or preliminary results to support the feasibility of achieving the highest cost, energy, and time savings via co-design (abstract). This absence is load-bearing because the central claim is framed as a project goal whose value depends on eventual outcomes that cannot be evaluated from the given text.
  Authors: We agree that the manuscript contains no empirical results, benchmarks, or derivations, as it is explicitly framed as an announcement of the recently launched BioPIM Project rather than a report of completed technical work. The abstract and text describe the motivation and high-level co-design goals without claiming demonstrated outcomes. Project announcements of this type are common in emerging interdisciplinary areas to outline objectives, attract collaborators, and stimulate discussion prior to results. The value lies in defining the research direction for sustainable genomics processing. To strengthen the manuscript, we can expand it with additional details on the specific genomics algorithms and PIM architectures targeted for co-design, as well as planned evaluation criteria.
  Revision: partial
- We cannot supply data, benchmarks, or preliminary results because the BioPIM Project has only recently been initiated and no such experiments have been performed yet.
Circularity Check
No significant circularity in project announcement
The manuscript is a project announcement describing the launch and goals of the BioPIM Project rather than a completed study asserting measured performance gains or presenting mathematical derivations. It contains no equations, fitted parameters, predictions, uniqueness theorems, or ansatzes that could reduce to self-citations or inputs by construction. The central statements are declarations of intent to co-design algorithms with PIM architectures, which are not empirical claims requiring internal validation or reduction to prior results within the paper. No load-bearing steps exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
  Paper passage: "The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.