Compute-in-eDRAM with Backend Integrated Indium Gallium Zinc Oxide Transistors
5 Pith papers cite this work.
Fields: cs.AR (5)
Citing papers
- DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format
DISCA achieves 3.59 TOPS/W per bit energy efficiency for matrix multiplication at 500 MHz in 180 nm CMOS using a compressed Bent-Pyramid stochastic format.
- Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
PLENA introduces a co-designed system with three optimization pathways for long-context agentic LLM inference, claiming up to 2.23x throughput over A100 and 4.04x energy efficiency.
- Emerging memory technologies at room/cryogenic temperature
Overview chapter surveying volatile and non-volatile memories including SRAM, DRAM, RRAM, MRAM, FeFET and cryogenic JJFET devices, with focus on principles, tradeoffs, and challenges.
- A complete discussion on fully reconfigurable, digital, scalable, graph and sparsity-aware near-memory accelerator for graph neural networks
- ABI: A tightly integrated, unified, sparsity-aware, reconfigurable, compute near-register file/cache GPU architecture with light-weight softmax for deep learning, linear algebra, and Ising compute