hub
Squeeze-and-Excitation Networks
16 Pith papers cite this work. Polarity classification is still indexing.
abstract
The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%. Models and code are available at https://github.com/hujie-frank/SENet.
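The abstract names the SE block but does not spell out its internals. As a rough illustration, the sketch below follows the commonly described form of the block: a "squeeze" step (global average pooling per channel), an "excitation" step (a two-layer bottleneck ending in a sigmoid), and a channel-wise rescaling of the input. This is not the authors' implementation. The function name `se_block`, the random weights, the tiny tensor shape, and the reduction ratio `r = 2` are all assumptions chosen for readability, and the code uses plain Python lists rather than a deep learning framework.

```python
import math
import random


def se_block(x, w1, w2):
    """Apply a squeeze-and-excitation style recalibration to one feature map.

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C weight matrix for the bottleneck layer (illustrative values)
    w2 : C x (C // r) weight matrix for the expansion layer (illustrative values)
    """
    # Squeeze: global average pooling reduces each channel to a single scalar.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]

    # Excitation, step 2: expand back to C values, then squash to (0, 1) with a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Recalibration: multiply every value in channel c by that channel's gate.
    y = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]
    return y, gates


# Hypothetical toy input: 4 channels of size 2 x 2, reduction ratio r = 2.
random.seed(0)
C, H, W, r = 4, 2, 2, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]

y, gates = se_block(x, w1, w2)
print("per-channel gates:", gates)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate a channel relative to the others. The output keeps the input's shape, which is what allows such a unit to be dropped into an existing architecture. In a trained network the two weight matrices would be learned along with the rest of the model rather than sampled at random.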
hub tools
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
A vanilla U-Net with 7.76M parameters achieves R²=0.834 and RMSE=1.01 cm on a global InSAR benchmark, beating larger attention models by 34% in R² and 51% in RMSE while running 2.5× faster.
Lightweight multi-task models using Gram matrices and PatchGAN-style architectures detect 53 weather classes from RGB images with F1 scores above 96% internally and 78% zero-shot externally, supported by a new 503k-image dataset.
citing papers explorer
- Sparse-LiDAR Prompting of Monocular Geometry Foundations: An Empirical Study Toward Long-Range Driving Depth
  SLIM adapts MoGe-2 to truly sparse LiDAR via partial-convolution encoder and multi-scale fusion neck, cutting absolute relative depth error by 39-51% at 100-150 m on Virtual KITTI and CARLA under density-agnostic training.
- Polarized Target Nuclear Magnetic Resonance Measurements with Deep Neural Networks
  Deep neural networks reduce fitting uncertainties in CW-NMR polarization measurements for dynamically polarized targets.
- Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval
  Introduces hybrid-attention decoupled metric learning to prevent partial learning and improve generalization in zero-shot image retrieval, claiming significant gains over prior methods.
- Interaction-and-Aggregation Network for Person Re-identification
  Introduces IA network with SIA and CIA modules to adaptively model spatial and channel feature interdependencies for improved person re-identification on benchmarks.
- Learning Data Augmentation Strategies for Object Detection
  Learned data augmentation policies optimized for object detection improve COCO mAP by more than 2.3 and transfer to other datasets and models.
- Attention U-Net: Learning Where to Look for the Pancreas
  Attention gates added to U-Net automatically focus on target organs in CT images and improve segmentation performance on abdominal datasets.
- SORA: Free Second-Order Attacks in Fast Adversarial Training
  SORA is an adaptive step-size adversarial training algorithm that formalizes epsilon overfitting, introduces the PertAlign metric to predict catastrophic overfitting, and dynamically adjusts perturbations to achieve state-of-the-art robustness and clean accuracy with fixed hyperparameters.
- Personalized Face Privacy Protection From a Single Image
  FaceCloak learns a lightweight identity-specific cloaking mask from a single image via synthetic face generation and iterative embedding perturbation to evade multiple recognition models.
- ReST: A Plug-and-Play Spatially-Constrained Representation Enhancement Framework for Local-Life Recommendation
  ReST enhances long-tail item representations for spatially constrained local-life recommendations via a Meta ID Warm-up Network and a contrastive SIDENet with hard sampling and dynamic alignment strategies.
- FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture
  FPCNet uses an encoder-decoder architecture with MD and SEU modules to learn multi-context crack features and achieves faster, more accurate pixel-level detection than prior methods on CFD and G45 datasets.
- Resource-Efficient CSI Prediction: A Gated Fusion and Factorized Projection Approach
  A gated-fusion CSI predictor using GRU, attention, and DSLH reaches -13.84 dB NMSE with 26% fewer parameters and 2.3x higher throughput than a LinFormer baseline on 3GPP channels.
- A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
  A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
- Attention mechanisms and transfer learning for robust peach leaf damage classification under domain shift
  EfficientNetB5 with CBAM reaches 93.3% accuracy on a 1,366-image peach leaf damage dataset and EfficientNetB3 with CBAM reaches 93% macro F1 after transfer to a 180-image local domain.
- Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving
  Imitation learning pretraining of a ResNet-34 DDPG agent improves performance on image-based autonomous driving in simulation over pure IL or pure RL.