Hadamard-domain write-and-verify (HD-PV and HARP) reduces uncorrelated read-noise variance by a factor of N in RRAM, preserving ML accuracy under severe noise with up to 6.1x lower latency and 9.5x better energy efficiency than conventional methods.
4 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- HARP: Hadamard-Domain Write-and-Verify for Noise-Robust RRAM Programming
Hadamard-domain write-and-verify (HD-PV and HARP) reduces uncorrelated read-noise variance by a factor of N in RRAM, preserving ML accuracy under severe noise with up to 6.1x lower latency and 9.5x better energy efficiency than conventional methods.
- Towards Green Wearable Computing: A Physics-Aware Spiking Neural Network for Energy-Efficient IMU-based Human Activity Recognition
PAS-Net is a fully multiplier-free spiking neural network that enforces human joint constraints spatially and uses causal neuromodulation temporally to achieve state-of-the-art accuracy on IMU HAR with up to 98% lower dynamic energy via early-exit.
- Trilinear Compute-in-Memory Architecture for Energy-Efficient Transformer Acceleration
TrilinearCIM enables complete in-memory Transformer attention computation via DG-FeFET three-operand MAC without runtime NVM reprogramming, delivering up to 46.6% energy reduction and 20.4% latency improvement on BERT and ViT benchmarks at 37.3% area cost.
- DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators
DeepStack introduces a fast performance model and hierarchical search method for co-optimizing 3D DRAM stacking, interconnects, and distributed scheduling in AI accelerators, delivering up to 9.5x throughput gains over baselines.