arxiv: 1709.01507 · v4 · pith:7BYNRDQRnew · submitted 2017-09-05 · 💻 cs.CV

Squeeze-and-Excitation Networks

classification 💻 cs.CV
keywords networks, spatial, squeeze-and-excitation, block, blocks, channel-wise, cnns, feature

The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%. Models and code are available at https://github.com/hujie-frank/SENet.
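
The abstract does not spell out the internals of the block, so the following is only a minimal NumPy sketch of the channel recalibration idea, assuming the structure described in the full paper: a "squeeze" step that global-average-pools each channel to one number, and an "excitation" step that passes those numbers through two fully connected layers (ReLU, then sigmoid) to produce one gate per channel. The function name `se_block`, the tensor shapes, the reduction ratio `r = 4`, and the random weights are illustrative assumptions, not the authors' released code.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Hypothetical SE-style recalibration of one feature map.

    x  : (C, H, W) feature map
    w1 : (C // r, C) weights of the bottleneck layer, b1 : (C // r,)
    w2 : (C, C // r) weights of the expansion layer,  b2 : (C,)
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling collapses each channel to a scalar.
    z = x.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid.
    h = np.maximum(w1 @ z + b1, 0.0)             # shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))     # shape (C,), each gate in (0, 1)
    # Recalibration: scale every channel by its gate via broadcasting.
    return x * s[:, None, None], s


# Illustrative sizes and random parameters (assumptions, not trained weights).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.standard_normal((C, H, W))
w1 = 0.1 * rng.standard_normal((C // r, C))
b1 = np.zeros(C // r)
w2 = 0.1 * rng.standard_normal((C, C // r))
b2 = np.zeros(C)

y, s = se_block(x, w1, b1, w2, b2)
print(y.shape, s.shape)
```

Because the output has the same shape as the input, a block of this kind can in principle be placed after an existing convolutional stage without changing the layers around it, which is consistent with the abstract's claim that SE blocks can be added to existing CNNs at slight additional computational cost.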

This paper has not been read by Pith yet.

Forward citations

Cited by 13 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Polarized Target Nuclear Magnetic Resonance Measurements with Deep Neural Networks

    physics.ins-det 2026-03 unverdicted novelty 7.0

    Deep neural networks reduce fitting uncertainties in CW-NMR polarization measurements for dynamically polarized targets.

  2. When Less Is More: Simplicity Beats Complexity for Physics-Constrained InSAR Phase Unwrapping

    cs.CV 2026-04 accept novelty 6.0

    A vanilla U-Net with 7.76M parameters achieves R²=0.834 and RMSE=1.01 cm on a global InSAR benchmark, beating larger attention models by 34% in R² and 51% in RMSE while running 2.5× faster.

  3. Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval

    cs.CV 2019-07 unverdicted novelty 6.0

    Introduces hybrid-attention decoupled metric learning to prevent partial learning and improve generalization in zero-shot image retrieval, claiming significant gains over prior methods.

  4. Interaction-and-Aggregation Network for Person Re-identification

    cs.CV 2019-07 unverdicted novelty 6.0

    Introduces IA network with SIA and CIA modules to adaptively model spatial and channel feature interdependencies for improved person re-identification on benchmarks.

  5. Learning Data Augmentation Strategies for Object Detection

    cs.CV 2019-06 unverdicted novelty 6.0

    Learned data augmentation policies optimized for object detection improve COCO mAP by more than 2.3 and transfer to other datasets and models.

  6. Attention U-Net: Learning Where to Look for the Pancreas

    cs.CV 2018-04 unverdicted novelty 6.0

    Attention gates added to U-Net automatically focus on target organs in CT images and improve segmentation performance on abdominal datasets.

  7. Personalized Face Privacy Protection From a Single Image

    cs.CV 2026-05 unverdicted novelty 5.0

    FaceCloak learns a lightweight identity-specific cloaking mask from a single image via synthetic face generation and iterative embedding perturbation to evade multiple recognition models.

  8. Heuristic Style Transfer for Real-Time, Efficient Weather Attribute Detection

    cs.CV 2026-04 conditional novelty 5.0

    Lightweight multi-task models using Gram matrices and PatchGAN-style architectures detect 53 weather classes from RGB images with F1 scores above 96% internally and 78% zero-shot externally, supported by a new 503k-im...

  9. ReST: A Plug-and-Play Spatially-Constrained Representation Enhancement Framework for Local-Life Recommendation

    cs.IR 2025-11 unverdicted novelty 5.0

    ReST enhances long-tail item representations for spatially constrained local-life recommendations via a Meta ID Warm-up Network and a contrastive SIDENet with hard sampling and dynamic alignment strategies.

  10. FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture

    cs.CV 2019-07 unverdicted novelty 5.0

    FPCNet uses an encoder-decoder architecture with MD and SEU modules to learn multi-context crack features and achieves faster, more accurate pixel-level detection than prior methods on CFD and G45 datasets.

  11. Resource-Efficient CSI Prediction: A Gated Fusion and Factorized Projection Approach

    eess.SP 2026-05 unverdicted novelty 4.0

    A gated-fusion CSI predictor using GRU, attention, and DSLH reaches -13.84 dB NMSE with 26% fewer parameters and 2.3x higher throughput than a LinFormer baseline on 3GPP channels.

  12. A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

    cs.LG 2026-04 unverdicted novelty 4.0

    A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.

  13. Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving

    cs.LG 2019-07 unverdicted novelty 3.0

    Imitation learning pretraining of a ResNet-34 DDPG agent improves performance on image-based autonomous driving in simulation over pure IL or pure RL.