Densely Connected Convolutional Networks

Gao Huang; Kilian Q. Weinberger; Laurens van der Maaten; Zhuang Liu

arxiv: 1608.06993 · v5 · pith:QKTGCOZOnew · submitted 2016-08-25 · 💻 cs.CV · cs.LG

classification 💻 cs.CV cs.LG

keywords layer · convolutional · layers · connections · networks · close · densenet · densenets

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
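
The connectivity pattern described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (that lives in the repository linked above). It assumes that feature maps are combined by concatenation and that every layer emits a fixed number of new feature maps, called `growth_rate` here. The function names and the toy "one float per feature map" representation are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

# Toy stand-in (assumption): a group of feature maps is a list of floats,
# one float per feature map. Real feature maps are multi-dimensional tensors.
FeatureMaps = List[float]
Layer = Callable[[FeatureMaps], FeatureMaps]


def dense_connection_count(num_layers: int) -> int:
    """Count direct connections among num_layers densely connected layers.

    Layer l (1-indexed) receives the network input plus the outputs of the
    l - 1 layers before it, i.e. l incoming connections, so the total is
    1 + 2 + ... + L = L(L+1)/2, the figure quoted in the abstract.
    """
    return sum(range(1, num_layers + 1))


def dense_input_channels(in_channels: int, growth_rate: int,
                         num_layers: int) -> List[int]:
    """Number of input feature maps seen by each layer under concatenation."""
    return [in_channels + i * growth_rate for i in range(num_layers)]


def dense_forward(x: FeatureMaps, layers: List[Layer]) -> FeatureMaps:
    """Run a toy dense stack: each layer sees everything produced before it."""
    features = [x]  # the input counts as the first group of feature maps
    for layer in layers:
        # Concatenate the input and all earlier outputs into one flat list.
        concatenated = [v for group in features for v in group]
        features.append(layer(concatenated))
    # The output is the concatenation of the input and every layer's output.
    return [v for group in features for v in group]


if __name__ == "__main__":
    print(dense_connection_count(5))         # 1 + 2 + 3 + 4 + 5
    print(dense_input_channels(16, 12, 4))   # grows by growth_rate per layer
    summing_layer = lambda inputs: [sum(inputs)]
    print(dense_forward([1.0, 2.0], [summing_layer, summing_layer]))
```

Because each layer's input grows linearly with depth, the sketch also shows why each layer can stay narrow: it adds only `growth_rate` new feature maps while still seeing everything computed before it.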

Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
cs.LG 2025-09 unverdicted novelty 7.0

DPQuant uses epoch-wise probabilistic layer rotation and DP loss sensitivity to quantize only a changing subset of layers, reducing accuracy degradation from quantization noise in DP-SGD and delivering up to 2.21x thr...

Unsupervised Source-Free Ranking of Biomedical Segmentation Models Under Distribution Shift
cs.CV 2025-03 unverdicted novelty 7.0

Presents the first unsupervised source-free framework for ranking semantic and instance segmentation models via prediction consistency under perturbations, with rankings correlating to target-domain performance across...

Mistake gating leads to energy and memory efficient continual learning
cs.AI 2026-04 unverdicted novelty 6.0

Mistake-gated plasticity reduces neural network updates by 50-80% by gating changes on classification errors, improving efficiency for continual learning without added hyperparameters.

A general framework for knowledge integration in machine learning for electromagnetic scattering using quasinormal modes
physics.optics 2025-09 unverdicted novelty 6.0

A universal physics-informed neural network framework for electromagnetic scattering based on quasinormal mode expansion that guarantees compliance with energy conservation and causality and shows improved data effici...

Towards Robust Voice Pathology Detection
cs.SD 2019-07 unverdicted novelty 6.0

Exploratory experiments combining four voice databases to evaluate XGBoost, DenseNet, and Isolation Forest on raw waveforms, spectrograms, MFCCs, and acoustic features for pathology detection, with peak F1 of 0.733.

A Multitask Network for Localization and Recognition of Text in Images
cs.CL 2019-06 unverdicted novelty 6.0

Presents an end-to-end multitask CNN with FPN, dynamic RoI pooling, and convolutional attention for simultaneous lexicon-free text localization and recognition in complex images.

SGDR: Stochastic Gradient Descent with Warm Restarts
cs.LG 2016-08 accept novelty 6.0

SGDR uses periodic warm restarts of the learning rate in SGD to reach new state-of-the-art error rates of 3.14% on CIFAR-10 and 16.21% on CIFAR-100.

Physics-informed convolutional neural networks for fluid flow through porous media
cs.LG 2026-05 unverdicted novelty 5.0

A physics-informed CNN predicts pore-scale velocity fields from geometry and serves as a warm-start to accelerate Lattice-Boltzmann solvers in over 90% of tested cases.

Non-identifiability of Explanations from Model Behavior in Deep Networks of Image Authenticity Judgments
cs.CV 2026-04 unverdicted novelty 5.0

Models predicting human authenticity judgments produce inconsistent attribution maps across architectures, showing that explanations are non-identifiable.

Attention Residuals
cs.CL 2026-03 unverdicted novelty 5.0

Attention Residuals replaces fixed residual summation with input-dependent softmax attention over preceding layers, and a blocked variant is shown to improve uniformity and downstream performance in a 48B-parameter mo...

HLGFA: High-Low Resolution Guided Feature Alignment for Unsupervised Anomaly Detection
cs.CV 2026-02 unverdicted novelty 5.0

HLGFA detects anomalies by identifying breakdowns in cross-resolution feature consistency between high- and low-resolution views of normal samples, guided by structure and detail priors, and reports 97.9% pixel AUROC ...

Detection of Lensed Gravitational Waves in the Millihertz Band Using Frequency-Domain Lensing Feature Extraction Network
astro-ph.IM 2025-12 unverdicted novelty 5.0

DCL-xLSTM neural network detects lensed GW events with AUC over 0.99 using training on PM and SIS lens models in the millihertz band.

Learning Multimodal Fixed-Point Weights using Gradient Descent
cs.LG 2019-07 unverdicted novelty 5.0

Gradient-based optimization learns symmetric Gaussian mixture modes for 2-bit fixed-point weight quantization, claiming state-of-the-art performance and self-adaptive weights.

Training Neural Networks with Optimal Double-Bayesian Learning
cs.LG 2026-05 unverdicted novelty 4.0

A double-Bayesian framework derives an optimal learning rate for neural network training via two antagonistic Bayesian processes.

PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction
cs.CV 2026-04 unverdicted novelty 4.0

PR3DICTR is a new open-access modular framework for 3D medical image classification and outcome prediction that works with as little as two lines of code.

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
cs.LG 2025-11 unverdicted novelty 4.0

FractalNet automatically generates and tests over 1,200 CNN architectures based on recursive fractal templates, achieving up to 80.18% accuracy on CIFAR-10 after five training epochs.

AMD Severity Prediction And Explainability Using Image Registration And Deep Embedded Clustering
cs.CV 2019-07 unverdicted novelty 4.0

A method using deep image registration and embedded clustering predicts AMD severity from OCT images with classification performance matching state-of-the-art and improved explainability via registration outputs.

Multi-Gate Residuals
cs.LG 2026-05 unverdicted novelty 3.0

Multi-Gate Residuals stabilizes activation scales in deep residual networks via multi-stream gating and attention pooling without added communication overhead.

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
cs.LG 2025-11 unverdicted novelty 3.0

Fractal templates enable systematic creation of more than 1,200 neural network variants that show strong performance and computational efficiency when trained on CIFAR-10 for five epochs.