Enhancing Layer Attention Efficiency through Pruning Redundant Retrievals

Hanze Li; Mengyao Zeng; Xiande Huang; Xiuqi Ge; Yaosong Du; Zhibo Yao

arxiv: 2503.06473 · v5 · pith:U3ZDS7XPnew · submitted 2025-03-09 · 💻 cs.CV · cs.AI


classification 💻 cs.CV cs.AI

keywords attention, layers, layer, redundancy, training, adjacent, efficiency, enhancing


Growing evidence suggests that layer attention mechanisms, which enhance interaction among layers in deep neural networks, have significantly advanced network architectures. However, existing layer attention methods suffer from redundancy, as attention weights learned by adjacent layers often become highly similar. This redundancy causes multiple layers to extract nearly identical features, reducing the model's representational capacity and increasing training time. To address this issue, we propose a novel approach to quantify redundancy by leveraging the Kullback-Leibler (KL) divergence between adjacent layers. Additionally, we introduce an Enhanced Beta Quantile Mapping (EBQM) method that accurately identifies and skips redundant layers, thereby maintaining model stability. Our proposed Efficient Layer Attention (ELA) architecture improves both training efficiency and overall performance, achieving a 30% reduction in training time while enhancing performance in tasks such as image classification and object detection.
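
The redundancy measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes each layer's attention weights form a discrete probability distribution over the same set of earlier layers, which is a simplification; the function names `kl_divergence` and `adjacent_kl` and the sample weights are hypothetical and not taken from the paper. The sketch covers only the KL step, not EBQM.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same support.

    eps guards against log(0); it is an illustrative choice, not the paper's.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def adjacent_kl(attn_weights):
    """KL divergence between each layer's weights and the next layer's."""
    return [
        kl_divergence(attn_weights[i], attn_weights[i + 1])
        for i in range(len(attn_weights) - 1)
    ]

# Hypothetical attention weights for three layers (each row sums to 1).
weights = [
    [0.70, 0.20, 0.10],
    [0.68, 0.21, 0.11],  # nearly identical to the previous layer
    [0.10, 0.30, 0.60],  # clearly different
]
scores = adjacent_kl(weights)
```

Under these assumptions, a small value in `scores` marks a pair of adjacent layers whose attention weights are close, which is the kind of redundancy the abstract describes. How such scores are turned into a decision to skip a layer is the role of EBQM and is not reproduced here.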


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Enhancing Layer Interaction Using Key-Correlated Layer Attention
cs.CV 2026-06 unverdicted novelty 5.0

KCLA is a linear-complexity layer attention mechanism that exploits high key cosine similarity to preserve dynamic updates and long-range cross-layer connections.