Training Deeper Convolutional Networks with Deep Supervision

Chen-Yu Lee; Liwei Wang; Svetlana Lazebnik; Zhuowen Tu

arxiv: 1505.02496 · v1 · pith:GPRO7CM3new · submitted 2015-05-11 · 💻 cs.CV

classification 💻 cs.CV

keywords training, convolutional, layers, networks, branches, deep, deeper, makes

One of the most promising ways of improving the performance of deep convolutional neural networks is to increase the number of convolutional layers. However, adding layers makes training more difficult and computationally expensive. To train deeper networks, we propose adding auxiliary supervision branches after certain intermediate layers during training. We formulate a simple rule of thumb to determine where these branches should be added. The resulting deeply supervised structure makes training much easier and also produces better classification results on ImageNet and the recently released, larger MIT Places dataset.
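
The abstract does not say how the auxiliary branches enter the training objective. A minimal sketch of one common arrangement, assuming each branch produces its own class scores and its cross-entropy loss is added to the main loss with a fixed weight, might look like the following. The function names, the `aux_weight` value of 0.3, and the toy scores are hypothetical and are not taken from the paper.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def deeply_supervised_loss(main_logits, aux_logits_list, label, aux_weight=0.3):
    """Main loss plus a weighted sum of auxiliary-branch losses.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    main = softmax_cross_entropy(main_logits, label)
    aux = sum(softmax_cross_entropy(a, label) for a in aux_logits_list)
    return main + aux_weight * aux

# Toy scores for a 3-class problem (made up for illustration).
main_logits = [2.0, 0.5, -1.0]                   # output of the final classifier
aux_logits = [[0.3, 0.2, 0.1], [1.0, 0.0, 0.0]]  # outputs of two auxiliary branches

# During training, the auxiliary branches contribute to the objective.
train_loss = deeply_supervised_loss(main_logits, aux_logits, label=0)

# The branches are a training-time device; with none attached,
# only the main classifier's loss remains.
main_only_loss = deeply_supervised_loss(main_logits, [], label=0)
```

In this sketch the auxiliary terms only add gradient signal at intermediate depths; where to attach the branches is decided by the paper's rule of thumb, which the abstract does not spell out.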

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

OmniISR: A Unified Framework for Centralized and Federated Learning via Intermediate Supervision and Regularization
cs.LG 2026-05 unverdicted novelty 6.0

OmniISR unifies centralized, federated, and hybrid learning by injecting mutual-information supervision and negative-entropy regularization at multiple hidden layers, with supporting convergence and drift bounds.

Temporal Aware Pruning for Efficient Diffusion-based Video Generation
cs.CV 2026-05 unverdicted novelty 6.0

TAPE introduces temporal-aware token pruning for diffusion-based video generation, using frame smoothing, layer reselection, and timestep budgets to achieve speedups while maintaining visual fidelity and coherence.

Temporal Aware Pruning for Efficient Diffusion-based Video Generation
cs.CV 2026-05 unverdicted novelty 5.0

TAPE applies temporal-aware token pruning with smoothing, reselection, and timestep scheduling to speed up video diffusion models while preserving visual fidelity and coherence.