The design process for Google's training chips: TPUv2 and TPUv3
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AR (2)
years: 2026 (2)
representative citing papers
FireBridge enables cycle-accurate hardware-firmware co-verification in standard simulators using randomized memory bridges, delivering up to 50x faster debug iterations than FPGA-based flows for accelerators such as systolic arrays and CGRAs.
citing papers explorer
- FusionCIM: Accelerating LLM Inference with Fusion-Driven Computing-in-Memory Architecture
  FusionCIM is a fusion-driven CIM accelerator for LLM inference that maps QK^T to IP-CIM and PV to OP-CIM, uses QO-stationary dataflow, and applies pattern-aware online softmax, delivering up to 3.86x energy savings and 1.98x speedup on LLaMA-3 at 29.4 TOPS/W.
- FireBridge: Cycle-Accurate Hardware + Firmware Co-Verification for Modern Accelerators
  FireBridge enables cycle-accurate hardware-firmware co-verification in standard simulators using randomized memory bridges, delivering up to 50x faster debug iterations than FPGA-based flows for accelerators such as systolic arrays and CGRAs.