Neural cache: Bit-serial in-cache acceleration of deep neural networks

2018

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

GEM3D CIM General Purpose Matrix Computation Using 3D Integrated SRAM eDRAM Hybrid Compute In Memory on Memory Architecture

cs.AR · 2026-04-15 · unverdicted · novelty 7.0

A 3D SRAM-eDRAM hybrid CIM design in 22nm FDSOI enables general-purpose matrix computations beyond dot products, with a claimed balance of latency, energy, and density.

AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization

cs.AR · 2026-04-20 · unverdicted · novelty 6.0

AQPIM performs in-memory product quantization of activations for LLMs on PIM hardware, reducing GPU-CPU communication by 90-98.5% and delivering a 3.4x speedup over prior PIM methods.

A comprehensive study on ILP acceleration accounting for sparsity, area, energy, data movement using near-memory architecture

cs.AR · 2026-05-16 · unverdicted · novelty 5.0

SPARK is a sparsity-aware near-cache ILP accelerator that reuses L1 cache structures to deliver up to 15x speedup and 152x energy reduction versus CPUs on sparse MIPLIB workloads with 1.4% area overhead.

citing papers explorer

Showing 3 of 3 citing papers.

GEM3D CIM General Purpose Matrix Computation Using 3D Integrated SRAM eDRAM Hybrid Compute In Memory on Memory Architecture
cs.AR · 2026-04-15 · unverdicted · none · ref 21
AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization
cs.AR · 2026-04-20 · unverdicted · none · ref 12
A comprehensive study on ILP acceleration accounting for sparsity, area, energy, data movement using near-memory architecture
cs.AR · 2026-05-16 · unverdicted · none · ref 18
