Processing-in-memory for genomics workloads

Abu Sebastian; Berkan Şahin; Can Alkan; Dominique Lavenier; Irem Boybat; Klea Zambaku; Konstantina Koliogeorgi; Leonid Yavits; Mohammad Sadrosadati; Onur Mutlu

arxiv: 2506.00597 · v2 · submitted 2025-05-31 · 🧬 q-bio.GN · cs.AR

William Andrew Simon, Leonid Yavits, Konstantina Koliogeorgi, Yann Falevoz, Yoshihiro Shibuya, Dominique Lavenier, Irem Boybat, Klea Zambaku, Berkan Şahin, Mohammad Sadrosadati, Onur Mutlu, Abu Sebastian, Rayan Chikhi, The BioPIM Consortium, Can Alkan

Pith reviewed 2026-05-19 11:58 UTC · model grok-4.3

keywords: processing-in-memory, genomics, bioinformatics, energy efficiency, high-throughput sequencing, algorithm co-design, P4 medicine

The pith

Co-designing genomics algorithms with processing-in-memory hardware can cut energy use, costs, and time for DNA sequencing analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents the BioPIM Project as a way to apply processing-in-memory technologies to bioinformatics workloads. High-throughput sequencing data currently moves through power-hungry clusters that add transfer overhead and delay results. The project redesigns common genomics algorithms and data structures to run directly on PIM hardware, aiming to remove the need for data centers. A reader would care if this shift makes large-scale genomic analysis practical in settings without massive computing infrastructure.

Core claim

The central claim is that co-designing algorithms and data structures commonly used in genomics with several PIM architectures will achieve the highest cost, energy, and time savings for processing high-throughput DNA and RNA sequencing data, enabling analysis without energy-hungry computer clusters or cloud platforms.

What carries the argument

Co-design of existing genomics algorithms and data structures with PIM architectures, allowing computation to occur at the memory location to reduce data movement.
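The data-movement argument can be illustrated with a toy byte-counting model. This is only a sketch under stated assumptions, not a model from the paper: the function names, the record-filtering scenario, and all numbers are hypothetical, and the model ignores the cost of the in-memory computation itself, load imbalance across PIM units, and any data layout overhead.

```python
def bytes_moved_host_centric(num_records: int, record_bytes: int) -> int:
    """Bytes crossing the memory bus when the CPU inspects every record."""
    return num_records * record_bytes


def bytes_moved_pim(
    num_matches: int, record_bytes: int, query_bytes: int, num_pim_units: int
) -> int:
    """Bytes crossing the bus when filtering happens next to the data.

    Assumption: the query is broadcast once to each PIM unit and only the
    matching records are sent back to the host.
    """
    return query_bytes * num_pim_units + num_matches * record_bytes


# Hypothetical workload: 1,000,000 records of 32 bytes, 1,000 of which match.
host_bytes = bytes_moved_host_centric(num_records=1_000_000, record_bytes=32)
pim_bytes = bytes_moved_pim(
    num_matches=1_000, record_bytes=32, query_bytes=32, num_pim_units=64
)
print(host_bytes, pim_bytes)
```

In this toy setting the saving comes entirely from selectivity: if most records match, the two counts converge, which is one reason a measured comparison on real workloads is needed.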

Load-bearing premise

That co-designing existing genomics algorithms and data structures with PIM hardware will produce substantially higher energy, cost, and time savings than conventional cluster-based processing.

What would settle it

A side-by-side measurement on a representative genomics task such as read mapping or variant calling showing that the PIM-co-designed version requires more total energy or more wall-clock time than the same task on a standard cluster.
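The test described above reduces to comparing two quantities per platform: wall-clock time and total energy. A minimal sketch of that comparison follows, assuming energy is estimated as average power multiplied by elapsed time. The class, the function, and every number are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMeasurement:
    """One end-to-end run of a genomics task on one platform (hypothetical)."""

    platform: str
    wall_clock_s: float  # elapsed time for the whole task, in seconds
    avg_power_w: float   # mean power draw over the run, in watts

    @property
    def energy_j(self) -> float:
        # Energy (joules) = average power (watts) x elapsed time (seconds).
        return self.avg_power_w * self.wall_clock_s


def claim_is_contradicted(pim: RunMeasurement, cluster: RunMeasurement) -> bool:
    """True if the PIM run needs more energy OR more time than the cluster run."""
    return pim.energy_j > cluster.energy_j or pim.wall_clock_s > cluster.wall_clock_s


# Placeholder values only, NOT measurements.
pim_run = RunMeasurement("pim", wall_clock_s=100.0, avg_power_w=50.0)
cluster_run = RunMeasurement("cluster", wall_clock_s=80.0, avg_power_w=400.0)
print(pim_run.energy_j, cluster_run.energy_j, claim_is_contradicted(pim_run, cluster_run))
```

With these placeholder values the PIM run uses less energy but takes longer, so the criterion as worded above would count it as a contradiction. A real evaluation would also need to fix the task, the dataset, and the measurement boundary (for example, whether data transfer to the cluster is included).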

Figures

Figures reproduced from arXiv: 2506.00597 by Abu Sebastian, Berkan Şahin, Can Alkan, Dominique Lavenier, Irem Boybat, Klea Zambaku, Konstantina Koliogeorgi, Leonid Yavits, Mohammad Sadrosadati, Onur Mutlu, Rayan Chikhi, The BioPIM Consortium, William Andrew Simon, Yann Falevoz, Yoshihiro Shibuya.

**Figure 2.** Frameworks of processing a) near and b) in memory. Each method can be accomplished …

**Figure 3.** The AL-Dorado hybrid CNN-LSTM basecalling network (a) is optimized to map efficiently …

**Figure 4.** GCOC genome classifier SoC: (a) GCOC evaluation setup, (b) GCOC SoC test board …

Original abstract

Low-cost, high-throughput DNA and RNA sequencing (HTS) data is the backbone of the life sciences. Genome sequencing is now becoming a part of Predictive, Preventive, Personalized, and Participatory (termed 'P4') medicine. All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer, consuming substantial energy, and wasting valuable time. Therefore, there is a need for fast, energy-efficient, and cost-efficient technologies that enable genomics research without requiring data centers and cloud platforms. We recently launched the BioPIM Project to leverage emerging processing-in-memory (PIM) technologies to enable energy- and cost-efficient analysis of bioinformatics workloads. The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a project announcement for the BioPIM consortium rather than a paper with measured results or detailed co-designs.

The main point is that the paper introduces the BioPIM Project, a new effort to co-design common genomics algorithms and data structures with several processing-in-memory architectures for lower energy, cost, and time in handling high-throughput sequencing data. It correctly flags the current reliance on power-hungry clusters and the push toward decentralized processing as sequencing moves into routine P4 medicine. The listed team mixes hardware and bioinformatics expertise, which is a reasonable setup for actual co-design work rather than pure hardware or pure algorithm papers. That framing of the problem and the project scope is the clearest part of the write-up. The text stays focused on intent and does not overclaim completed savings or specific speedups.

The soft spot is that there are still no benchmarks, no named PIM platforms with concrete mappings, no example workloads like read alignment or variant calling with performance estimates, and no early simulations. The goal of achieving the highest savings is stated as a project focus but cannot be checked against data or derivations yet. This leaves the central assumption about substantial gains over clusters as an open question rather than a supported result.

The paper is mainly useful for people already working at the PIM-bioinformatics boundary who want to know the scope and participants of this initiative. A reader looking for new hardware application ideas or collaboration signals could get value from the overview. It does not yet look ready for standard peer review as a full research contribution because the evidential base is thin; a workshop or project-description venue would fit better until some concrete co-design outcomes appear.

Referee Report

1 major / 0 minor

Summary. The manuscript announces the recent launch of the BioPIM Project, which aims to leverage processing-in-memory (PIM) technologies by co-designing commonly used genomics algorithms and data structures with several PIM architectures. The goal is to enable energy-efficient, cost-efficient, and fast analysis of high-throughput sequencing (HTS) data for bioinformatics workloads, reducing reliance on energy-hungry computer clusters and data centers in support of P4 medicine.

Significance. If the envisioned co-design efforts prove successful, the work could have substantial significance for sustainable genomics by lowering energy consumption and costs associated with the growing volumes of HTS data. However, the manuscript contains no results, benchmarks, or specific technical details, so its contribution is limited to a forward-looking project description rather than demonstrated advances.

major comments (1)

The manuscript supplies no data, benchmarks, derivations, error analysis, or preliminary results to support the feasibility of achieving the highest cost, energy, and time savings via co-design (abstract). This absence is load-bearing because the central claim is framed as a project goal whose value depends on eventual outcomes that cannot be evaluated from the given text.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for reviewing our manuscript on the BioPIM Project. We address the major comment below and clarify the intended scope of this work as a project announcement.

Referee: The manuscript supplies no data, benchmarks, derivations, error analysis, or preliminary results to support the feasibility of achieving the highest cost, energy, and time savings via co-design (abstract). This absence is load-bearing because the central claim is framed as a project goal whose value depends on eventual outcomes that cannot be evaluated from the given text.

Authors: We agree that the manuscript contains no empirical results, benchmarks, or derivations, as it is explicitly framed as an announcement of the recently launched BioPIM Project rather than a report of completed technical work. The abstract and text describe the motivation and high-level co-design goals without claiming demonstrated outcomes. Project announcements of this type are common in emerging interdisciplinary areas to outline objectives, attract collaborators, and stimulate discussion prior to results. The value lies in defining the research direction for sustainable genomics processing. To strengthen the manuscript, we can expand it with additional details on the specific genomics algorithms and PIM architectures targeted for co-design, as well as planned evaluation criteria.

Revision: partial

standing simulated objections not resolved

We cannot supply data, benchmarks, or preliminary results because the BioPIM Project has only recently been initiated and no such experiments have been performed yet.

Circularity Check

0 steps flagged

No significant circularity in project announcement

The manuscript is a project announcement describing the launch and goals of the BioPIM Project rather than a completed study asserting measured performance gains or presenting mathematical derivations. It contains no equations, fitted parameters, predictions, uniqueness theorems, or ansatzes that could reduce to self-citations or inputs by construction. The central statements are declarations of intent to co-design algorithms with PIM architectures, which are not empirical claims requiring internal validation or reduction to prior results within the paper. No load-bearing steps exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The provided abstract contains no fitted numerical parameters, no new postulated physical entities, and relies only on the standard domain assumption that current genomics pipelines are energy-intensive because of data movement.

axioms (1)

Domain assumption: All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer.
Explicitly stated in the third sentence of the abstract as the motivation for the project.

pith-pipeline@v0.9.0 · 5735 in / 1299 out tokens · 50247 ms · 2026-05-19T11:58:02.779613+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

[1]

The Sequence Read Archive: a decade more of explosive growth.,

K. Katz, O. Shutov, R. Lapoint, M. Kimelman, J. R. Brister, and C. O’Sullivan, “The Sequence Read Archive: a decade more of explosive growth.,” Nucleic acids research, vol. 50, pp. D387– D390, Jan. 2022

work page 2022
[2]

Genomic analysis in the age of human genome sequencing.,

T. Lappalainen, A. J. Scott, M. Brandt, and I. M. Hall, “Genomic analysis in the age of human genome sequencing.,” Cell, vol. 177, pp. 70–84, Mar. 2019

work page 2019
[3]

A survey of best practices for RNA-seq data analysis.,

A. Conesa, P. Madrigal, S. Tarazona, D. Gomez-Cabrero, A. Cervera, A. McPherson, M. W. Szcześniak, D. J. Gaffney, L. L. Elo, X. Zhang, and A. Mortazavi, “A survey of best practices for RNA-seq data analysis.,” Genome Biology, vol. 17, p. 13, Jan. 2016

work page 2016
[4]

Shouji: a fast and efficient pre-alignment filter for sequence alignment.,

M. Alser, H. Hassan, A. Kumar, O. Mutlu, and C. Alkan, “Shouji: a fast and efficient pre-alignment filter for sequence alignment.,” Bioinformatics, vol. 35, pp. 4255–4263, Nov. 2019

work page 2019
[5]

Parallelization of the banded Needleman & Wunsch algorithm on UPMEM PiM architecture for long DNA sequence alignment,

M. Mognol, D. Lavenier, and J. Legriel, “Parallelization of the banded Needleman & Wunsch algorithm on UPMEM PiM architecture for long DNA sequence alignment,” in Proceedings of the 53rd International Conference on Parallel Processing, ICPP 2024, Gotland, Sweden, August 12-15, 2024, pp. 1062–1071, ACM, 2024

work page 2024
[6]

A framework for high-throughput sequence alignment using real processing-in-memory systems.,

S. Diab, A. Nassereldine, M. Alser, J. Gómez Luna, O. Mutlu, and I. El Hajj, “A framework for high-throughput sequence alignment using real processing-in-memory systems.,” Bioinformatics, vol. 39, May 2023

work page 2023
[7]

MiMyCS: A processing-in-memory read mapper for compressing next-gen sequencing datasets,

F. D. Moor, M. Mognol, C. Deltel, E. Drezen, J. Legriel, and D. Lavenier, “MiMyCS: A processing-in-memory read mapper for compressing next-gen sequencing datasets,” (Lisbon, Portugal), pp. 6716–6723, IEEE, 2024

work page 2024
[8]

GAPiM: Discovering genetic variations on a real processing-in-memory system,

N. Abecassis, J. Gómez-Luna, O. Mutlu, R. Ginosar, A. Moisson-Franckhauser, and L. Yavits, “GAPiM: Discovering genetic variations on a real processing-in-memory system,” bioRxiv, 2023.

work page 2023
[9]

CiMBA: Accelerating genome sequencing through on-device basecalling via compute-in-memory,

W. A. Simon, I. Boybat, R. Kodra, E. Ferro, G. Singh, M. Alser, S. Jain, H. Tsai, G. W. Burr, O. Mutlu, and A. Sebastian, “CiMBA: Accelerating genome sequencing through on-device basecalling via compute-in-memory,” IEEE Trans. Parallel Distributed Syst., vol. 36, no. 6, pp. 1130–1145, 2025

work page 2025
[10]

GCOC: A genome classifier-on-chip based on similarity search content addressable memory,

Y. Harary, P. Snapir, S. S. Tov, C. Kruphman, E. Rechef, Z. Jahshan, E. Garzón, and L. Yavits, “GCOC: A genome classifier-on-chip based on similarity search content addressable memory,” IEEE Transactions on Biomedical Circuits and Systems, 2024

work page 2024
[11]

DIPER: Detection and identification of pathogens using edit distance-tolerant resistive cam,

I. Merlin, E. Garzón, A. Fish, and L. Yavits, “DIPER: Detection and identification of pathogens using edit distance-tolerant resistive cam,” IEEE Transactions on Computers, vol. 73, no. 10, pp. 2463–2473, 2023

work page 2023
[12]

DASH-CAM: dynamic approximate search content addressable memory for genome classification,

Z. Jahshan, I. Merlin, E. Garzón, and L. Yavits, “DASH-CAM: dynamic approximate search content addressable memory for genome classification,” in Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1453–1465, 2023

work page 2023
[13]

MajorK: Majority based kmer matching in commodity dram,

Z. Jahshan and L. Yavits, “MajorK: Majority based kmer matching in commodity dram,” IEEE Computer Architecture Letters, 2024

work page 2024
[14]

Energy efficiency impact of processing in memory: A comprehensive review of workloads on the UPMEM architecture,

Y. Falevoz and J. Legriel, “Energy efficiency impact of processing in memory: A comprehensive review of workloads on the UPMEM architecture,” in Euro-Par 2023: Parallel Processing Workshops (D. Zeinalipour, D. Blanco Heras, G. Pallis, H. Herodotou, D. Trihinas, D. Balouek, P. Diehl, T. Cojean, K. Fürlinger, M. H. Kirkeby, M. Nardelli, and P. Di Sanzo, e...

work page 2023
[15]

Streaming algorithms for embedding and computing edit distance in the low distance regime,

D. Chakraborty, E. Goldenberg, and M. Koucký, “Streaming algorithms for embedding and computing edit distance in the low distance regime,” in Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, STOC ’16, (New York, NY, USA), pp. 712–725, Association for Computing Machinery, June 2016

work page 2016
[16]

Cimba: Accelerating genome sequencing through on-device basecalling via compute-in-memory,

W. A. Simon, I. Boybat, R. Kodra, E. Ferro, G. Singh, M. Alser, S. Jain, H. Tsai, G. W. Burr, O. Mutlu, and A. Sebastian, “Cimba: Accelerating genome sequencing through on-device basecalling via compute-in-memory,” IEEE Transactions on Parallel and Distributed Systems, pp. 1–15, 2025

work page 2025
[17]

Aihwkit-lightning: a scalable hw-aware training toolkit for analog in-memory computing,

J. Büchel, W. A. Simon, C. Lammie, G. Acampa, K. El Maghraoui, M. Le Gallo, and A. Sebastian, “Aihwkit-lightning: a scalable hw-aware training toolkit for analog in-memory computing,” in NeurIPS 2024 Workshop Machine Learning with new Compute Paradigms, 2024

work page 2024
[18]

MegIS: High-performance, energy-efficient, and low-cost metagenomic analysis with in-storage processing,

N. M. Ghiasi, M. Sadrosadati, H. Mustafa, A. Gollwitzer, C. Firtina, J. Eudine, H. Mao, J. Lindegger, M. B. Cavlak, M. Alser, J. Park, and O. Mutlu, “MegIS: High-performance, energy-efficient, and low-cost metagenomic analysis with in-storage processing,” in Proceedings of the 51st Annual International Symposium on Computer Architecture (ISCA), (Los Alami...

work page 2024
[19]

Flash-cosmos: In-flash bulk bitwise operations using inherent computation capability of nand flash memory,

J. Park, R. Azizi, G. F. Oliveira, M. Sadrosadati, R. Nadig, D. Novo, J. Gómez-Luna, M. Kim, and O. Mutlu, “Flash-cosmos: In-flash bulk bitwise operations using inherent computation capability of nand flash memory,” 2022.

work page 2022
[20]

Venice: Improving solid-state drive parallelism at low cost via conflict-free accesses,

R. Nadig, M. Sadrosadati, H. Mao, N. M. Ghiasi, A. Tavakkol, J. Park, H. Sarbazi-Azad, J. G. Luna, and O. Mutlu, “Venice: Improving solid-state drive parallelism at low cost via conflict-free accesses,” in Proceedings of the 50th Annual International Symposium on Computer Architecture, ISCA ’23, (New York, NY, USA), Association for Computing Machinery, 2023

work page 2023
[21]

PIM-AI: A novel architecture for high-efficiency LLM inference,

C. Ortega, Y. Falevoz, and R. Ayrignac, “PIM-AI: A novel architecture for high-efficiency LLM inference,” 2024.

work page 2024
