The Processing Using Memory Paradigm: In-DRAM Bulk Copy, Initialization, Bitwise AND and OR
In existing systems, the off-chip memory interface allows the memory controller to perform only read or write operations. Therefore, to perform any operation, the processor must first read the source data and then write the result back to memory after performing the operation. This approach incurs high latency and consumes significant bandwidth and energy for operations that work on large amounts of data. Several works have proposed techniques to process data near memory by adding a small amount of compute logic closer to the main memory chips. In this article, we describe two techniques proposed by recent works that take this processing-in-memory approach further by exploiting the underlying operation of the main memory technology to perform more complex tasks. First, we describe RowClone, a mechanism that exploits DRAM technology to perform bulk copy and initialization operations completely inside main memory. We then describe a complementary work that uses DRAM to perform bulk bitwise AND and OR operations inside main memory. These two techniques significantly improve the performance and energy efficiency of the respective operations.
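The bandwidth argument in the abstract can be illustrated with a toy accounting model. The sketch below is not RowClone or the bitwise AND/OR mechanism itself, and it models no DRAM circuitry or timing. It only counts how many data bytes would cross the memory channel when the processor performs a copy or an AND, compared with a hypothetical in-memory command that moves no data off-chip. The class `ToyMemory`, its methods, and the 8192-byte row size are assumptions made for illustration.

```python
class ToyMemory:
    """Toy model: rows of bytes plus a count of data bytes crossing the channel."""

    def __init__(self, num_rows, row_bytes):
        self.rows = [bytearray(row_bytes) for _ in range(num_rows)]
        self.channel_bytes = 0  # data moved between memory and processor

    # Conventional interface: only reads and writes, all data crosses the channel.
    def read_row(self, r):
        self.channel_bytes += len(self.rows[r])
        return bytes(self.rows[r])

    def write_row(self, r, data):
        self.channel_bytes += len(data)
        self.rows[r][:] = data

    # Hypothetical in-memory commands: the data never leaves the memory,
    # so the channel counter is not touched.
    def in_memory_copy(self, src, dst):
        self.rows[dst][:] = self.rows[src]

    def in_memory_and(self, a, b, dst):
        self.rows[dst][:] = bytes(x & y for x, y in zip(self.rows[a], self.rows[b]))


def cpu_copy(mem, src, dst):
    """Processor-side copy: read the source row, then write it back."""
    mem.write_row(dst, mem.read_row(src))


def cpu_and(mem, a, b, dst):
    """Processor-side AND: read both operands, compute, write the result."""
    ra = mem.read_row(a)
    rb = mem.read_row(b)
    mem.write_row(dst, bytes(x & y for x, y in zip(ra, rb)))
```

With an 8192-byte row, `cpu_copy` moves 16384 bytes over the channel (one row read, one row written) and `cpu_and` moves 24576 bytes (two rows read, one written), while the hypothetical in-memory commands produce the same row contents with no data transfer in this model.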
Forward citations
Cited by 1 Pith paper
HE-PIM: Demystifying Homomorphic Operations on a Real-world Processing-in-Memory System
Characterization of HE kernels on commercial UPMEM PIM identifies modular multiplication and per-bank capacity as dominant bottlenecks and concludes PIM becomes competitive with CPU/GPU once those are addressed.