Topkima-former: Low-energy, low-latency inference for transformers using top-k in-memory ADC. IEEE Transactions on Circuits and Systems I: Regular Papers, 2025

Shuai Dong, Junyi Yang, Xiaoqi Peng, Hongyang Shang, Ye Ke, Xiaofeng Yang, Hongjie Liu, Arindam Basu · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

FAR: Function-preserving Attention Replacement for IMC-friendly Inference

cs.CV · 2025-05-24 · unverdicted · novelty 5.0

FAR substitutes self-attention in pretrained DeiTs with multi-head bidirectional LSTMs via block-wise distillation and structured pruning to enable IMC-compatible inference with comparable accuracy and lower latency.

citing papers explorer

FAR: Function-preserving Attention Replacement for IMC-friendly Inference cs.CV · 2025-05-24 · unverdicted · none · ref 43